Richard Durbin

Other affiliations: Wellcome Trust Sanger Institute, University of Manchester, Wellcome Trust ...

Bio: Richard Durbin is an academic researcher at the University of Cambridge. He has contributed to research on the topics Genome and Population, has an h-index of 125, and has co-authored 319 publications that have received 207,192 citations. His previous affiliations include the Wellcome Trust Sanger Institute and the University of Manchester.

Topics: Genome, Population, Genomics, Gene, Sequence assembly ...

Papers published on a yearly basis (chart omitted; years listed span 1959 to 2023)

Papers

Journal Article

Inferring Selection on Amino Acid Preference in Protein Domains

Alan M. Moses¹, Richard Durbin¹

Wellcome Trust Sanger Institute¹

18 Dec 2008-Molecular Biology and Evolution

TL;DR: It is shown that it is possible to assign preferred and unpreferred states to amino acid changing mutations that occur in protein domains, and that this effect is quantitative, such that there is a correlation between the shift in frequency of preferred alleles and the predicted fitness effect.

Abstract: Models that explicitly account for the effect of selection on new mutations have been proposed to account for "codon bias" or the excess of "preferred" codons that results from selection for translational efficiency and/or accuracy. In principle, such models can be applied to any mutation that results in a preferred allele, but in most cases, the fitness effect of a specific mutation cannot be predicted. Here we show that it is possible to assign preferred and unpreferred states to amino acid changing mutations that occur in protein domains. We propose that mutations that lead to more common amino acids (at a given position in a domain) can be considered "preferred alleles" just as are synonymous mutations leading to codons for more abundant tRNAs. We use genome-scale polymorphism data to show that alleles for preferred amino acids in protein domains occur at higher frequencies in the population, as has been shown for preferred codons. We show that this effect is quantitative, such that there is a correlation between the shift in frequency of preferred alleles and the predicted fitness effect. As expected, we also observe a reduction in the numbers of polymorphisms and substitutions at more important positions in domains, consistent with stronger selection at those positions. We examine the derived allele frequency distribution and polymorphism to divergence ratios of preferred and unpreferred differences and find evidence for both negative and positive selections acting to maintain protein domains in the human population. Finally, we analyze a model for selection on amino acid preferences in protein domains and find that it is consistent with the quantitative effects that we observe.

11 citations

Posted Content

A haplotype-resolved, de novo genome assembly for the wood tiger moth (Arctia plantaginis) through trio binning

Eugenie C Yen¹, Shane A. McCarthy¹,², Juan A. Galarza³, Tomas N Generalovic¹, Sarah Pelan², Petr Nguyen⁴,⁵, Joana I. Meier¹,⁶, Ian A. Warren¹, Johanna Mappes³, Richard Durbin¹,², Chris D. Jiggins¹,⁶

University of Cambridge¹, Wellcome Trust Sanger Institute², University of Jyväskylä³, Academy of Sciences of the Czech Republic⁴, Sewanee: The University of the South⁵, St. John's College⁶

02 Mar 2020-bioRxiv

TL;DR: This assembly of the wood tiger moth genome is one of the highest quality genomes available for Lepidoptera, supporting trio binning as a potent strategy for assembling highly heterozygous genomes.

Abstract: Background: Diploid genome assembly is typically impeded by heterozygosity, as it introduces errors when haplotypes are collapsed into a consensus sequence. Trio binning offers an innovative solution which exploits heterozygosity for assembly. Short, parental reads are used to assign parental origin to long reads from their F1 offspring before assembly, enabling complete haplotype resolution. Trio binning could therefore provide an effective strategy for assembling highly heterozygous genomes which are traditionally problematic, such as insect genomes. This includes the wood tiger moth (Arctia plantaginis), which is an evolutionary study system for warning colour polymorphism. Findings: We produced a high-quality, haplotype-resolved assembly for Arctia plantaginis through trio binning. We sequenced a same-species family (F1 heterozygosity ∼1.9%) and used parental Illumina reads to bin 99.98% of offspring Pacific Biosciences reads by parental origin, before assembling each haplotype separately and scaffolding with 10X linked-reads. Both assemblies are highly contiguous (mean scaffold N50: 8.2 Mb) and complete (mean BUSCO completeness: 97.3%), with complete annotations and 31 chromosomes identified through karyotyping. We employed the assembly to analyse genome-wide population structure and relationships between 40 wild resequenced individuals from five populations across Europe, revealing the Georgian population as the most genetically differentiated with the lowest genetic diversity. Conclusions: We present the first invertebrate genome to be assembled via trio binning. This assembly is one of the highest quality genomes available for Lepidoptera, supporting trio binning as a potent strategy for assembling highly heterozygous genomes. Using this assembly, we provide genomic insights into geographic population structure of Arctia plantaginis.

10 citations

Journal Article

Ethical, Legal, and Social Issues in the Earth BioGenome Project

Jacob S. Sherkow¹, Katharine Barker², Irus Braverman³, Bob Cook-Deegan⁴, Richard Durbin⁵, Carla Easter⁶, Melissa M. Goldstein⁷, Maui Hudson⁸, W. John Kress², Harris A. Lewin⁹, Debra J. H. Mathews¹⁰, Catherine McCarthy¹¹, Ann McCartney⁶, Manuela da Silva¹², Andrew W. Torrance¹³, Henry T. Greely¹⁴

University of Illinois at Urbana–Champaign¹, Smithsonian Institution², University at Buffalo³, Arizona State University⁴, University of Cambridge⁵, National Institutes of Health⁶, George Washington University⁷, University of Waikato⁸, University of California, Davis⁹, Johns Hopkins University¹⁰, Wellcome Trust Sanger Institute¹¹, Oswaldo Cruz Foundation¹², University of Kansas¹³, Stanford University¹⁴

25 Mar 2021-Social Science Research Network

TL;DR: The Earth BioGenome Project (EBP) is an audacious endeavor to obtain whole genome sequences of representatives from all eukaryotic species on earth; this paper catalogs the complicated ethical, legal, and social issues it faces.

Abstract: The Earth BioGenome Project (EBP) is an audacious endeavor to obtain whole genome sequences of representatives from all eukaryotic species on earth. In addition to the Project’s technical and organizational challenges, it also faces complicated ethical, legal, and social issues. This paper, from members of the EBP’s Ethical, Legal, and Social Issues (ELSI) Committee, catalogs these ELSI concerns arising from EBP. While we do not, and cannot, provide simple, overarching solutions for all of the issues raised here, we conclude our Perspective by beginning to chart a path forward for EBP’s work.

10 citations

Journal Article

Erratum: WormBase: Network access to the genome and biology of Caenorhabditis elegans (Nucleic Acids Research (2001) vol. 29 (82-86))

Lincoln Stein, Marco Mangone, Erich M. Schwarz, Richard Durbin, Jean Thierry-Mieg, John Spieth, Paul W. Sternberg

15 Feb 2001-Nucleic Acids Research

10 citations

Posted Content

Differential use of multiple genetic sex determination systems in divergent ecomorphs of an African crater lake cichlid

Hannah Munby¹, Tyler Linderoth¹, Bettina Fischer¹, Mingliu Du¹,², Grégoire Vernaz¹,², Alexandra M. Tyers³, Benjamin P. Ngatunga, Asilatu Shechonge, Hubert Denise¹, Shane A. McCarthy¹,², Iliana Bista¹,², Eric A. Miska¹,², M. Emília Santos¹, Martin J. Genner⁴, George F. Turner³, Richard Durbin¹,²

University of Cambridge¹, Wellcome Trust Sanger Institute², Bangor University³, University of Bristol⁴

06 Aug 2021-bioRxiv

TL;DR: In this paper, the authors looked for sex-associated loci in full genome data from 647 individuals of Astatotilapia calliptera from Lake Masoko, a small isolated crater lake in Tanzania, which contains two distinct ecomorphs of the species.

Abstract: African cichlid fishes not only exhibit remarkably high rates of speciation but also have some of the fastest evolving sex determination systems in vertebrates. However, little is known empirically in cichlids about the genetic mechanisms generating new sex-determining variants, what forces dictate their fate, the demographic scales at which they evolve, and whether they are related to speciation. To address these questions, we looked for sex-associated loci in full genome data from 647 individuals of Astatotilapia calliptera from Lake Masoko, a small isolated crater lake in Tanzania, which contains two distinct ecomorphs of the species. We identified three separate XY systems on recombining chromosomes. Two Y alleles derive from mutations that increase expression of the gonadal soma-derived factor gene (gsdf) on chromosome 7; the first is a tandem duplication of the entire gene observed throughout much of the Lake Malawi haplochromine cichlid radiation to which A. calliptera belongs, and the second is a 5 kb insertion directly upstream of gsdf. Both the latter variant and another 700 bp insertion on chromosome 19 responsible for the third Y allele arose from transposable element insertions. Males belonging to the Masoko deep-water benthic ecomorph are determined exclusively by the gsdf duplication, whereas all three Y alleles are used in the Masoko littoral ecomorph, in which they appear to act antagonistically among males with different amounts of benthic admixture. This antagonism in the face of ongoing admixture may be important for sustaining multifactorial sex determination in Lake Masoko. In addition to identifying the molecular basis of three coexisting sex determining alleles, these results demonstrate that genetic interactions between Y alleles and genetic background can potentially affect fitness and adaptive evolution.

10 citations

Cited by

Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul¹, Thomas L. Madden, Alejandro A. Schäffer¹, Jinghui Zhang, Zheng Zhang², Webb Miller², David J. Lipman

National Institutes of Health¹, Pennsylvania State University²

01 Sep 1997-Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal Article

The Sequence Alignment/Map format and SAMtools

Heng Li¹, Bob Handsaker², Alec Wysoker², T. J. Fennell², Jue Ruan³, Nils Homer², Gabor T. Marth⁴, Gonçalo R. Abecasis², Richard Durbin¹

Wellcome Trust Sanger Institute¹, University of California, Los Angeles², Chinese Academy of Sciences³, Boston College⁴

01 Aug 2009-Bioinformatics

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, a variant caller and an alignment viewer, and thus provides universal tools for processing read alignments.

Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net Contact: [email protected]

45,957 citations

Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li¹, Richard Durbin¹

Wellcome Trust Sanger Institute¹

01 Jul 2009-Bioinformatics

TL;DR: The Burrows-Wheeler Alignment tool (BWA) is a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT); it efficiently aligns short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: [email protected]

43,862 citations

Journal Article

Fiji: an open-source platform for biological-image analysis

Johannes Schindelin¹, Ignacio Arganda-Carreras², Erwin Frise³, Verena Kaynig⁴, Mark Longair⁴, Tobias Pietzsch¹, Stephan Preibisch¹, Curtis Rueden⁵, Stephan Saalfeld¹, Benjamin Schmid¹, Jean-Yves Tinevez¹, Daniel J. White¹, Volker Hartenstein¹, Kevin W. Eliceiri⁵, Pavel Tomancak¹, Albert Cardona¹

Max Planck Society¹, Massachusetts Institute of Technology², Lawrence Berkeley National Laboratory³, ETH Zurich⁴, University of Wisconsin-Madison⁵

01 Jul 2012-Nature Methods

TL;DR: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis that facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system.

Abstract: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

43,540 citations

Journal Article

Trimmomatic: a flexible trimmer for Illumina sequence data

Anthony Bolger¹, Marc Lohse¹, Bjoern Usadel¹

Max Planck Society¹

01 Aug 2014-Bioinformatics

TL;DR: Trimmomatic is a more flexible and efficient preprocessing tool that correctly handles paired-end data; it is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

Abstract: Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. Availability and implementation: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic Contact: usadel@bio1.rwth-aachen.de Supplementary information: Supplementary data are available at Bioinformatics online.

39,291 citations
