Richard Durbin

Other affiliations: Wellcome Trust Sanger Institute, University of Manchester, Wellcome Trust

Bio: Richard Durbin is an academic researcher at the University of Cambridge who has contributed to research on the topics Genome and Population. The author has an h-index of 125 and has co-authored 319 publications that have received 207,192 citations. Previous affiliations include the Wellcome Trust Sanger Institute and the University of Manchester.

Topics: Genome, Population, Genomics, Gene, Sequence assembly

Papers published on a yearly basis (chart not reproduced; years listed: 1959–1960, 1985–1990, 1992, 1994–2023)

Papers

Journal Article

Comparative Analysis of Noncoding Regions of 77 Orthologous Mouse and Human Gene Pairs

Niclas Jareborg¹, Ewan Birney, Richard Durbin

Wellcome Trust¹

01 Sep 1999-Genome Research

TL;DR: A data set of 77 genomic mouse/human gene pairs has been compiled from the EMBL nucleotide database, and their corresponding features determined, and a new alignment algorithm was developed to cope with the fact that large parts of noncoding sequences are not alignable in a meaningful way because of genetic drift.

Abstract: A data set of 77 genomic mouse/human gene pairs has been compiled from the EMBL nucleotide database, and their corresponding features determined. This set was used to analyze the degree of conservation of noncoding sequences between mouse and human. A new alignment algorithm was developed to cope with the fact that large parts of noncoding sequences are not alignable in a meaningful way because of genetic drift. This new algorithm, DNA Block Aligner (DBA), finds colinear-conserved blocks that are flanked by nonconserved sequences of varying lengths. The noncoding regions of the data set were aligned with DBA. The proportion of the noncoding regions covered by blocks >60% identical was 36% for upstream regions, 50% for 5' UTRs, 23% for introns, and 56% for 3' UTRs. These blocks of high identity were more or less evenly distributed across the length of the features, except for upstream regions in which the first 100 bp upstream of the transcription start site was covered in up to 70% of the gene pairs. This data set complements earlier sets on the basis of cDNA sequences and will be useful for further comparative studies. [This paper contains supplementary data that can be found at http://www.genome.org [corrected]].

216 citations

Journal Article

The population history of northeastern Siberia since the Pleistocene

Martin Sikora¹, Vladimir V. Pitulko², Vitor C. Sousa³, Vitor C. Sousa⁴, Vitor C. Sousa⁵, Morten E. Allentoft¹, Lasse Vinner¹, Simon Rasmussen¹, Simon Rasmussen⁶, Ashot Margaryan¹, Peter de Barros Damgaard¹, Constanza de la Fuente Castro⁷, Constanza de la Fuente Castro¹, Gabriel Renaud¹, Melinda A. Yang⁸, Qiaomei Fu⁸, Isabelle Dupanloup, Konstantinos Giampoudakis¹, David Nogués-Bravo¹, Carsten Rahbek¹, Guus Kroonen⁹, Guus Kroonen¹, Michaël Peyrot⁹, Hugh McColl¹, Sergey Vasilyev², Elizaveta Veselovskaya², Margarita Gerasimova², Elena Y. Pavlova¹⁰, Elena Y. Pavlova², Vyacheslav G. Chasnyk, Pavel A. Nikolskiy², Andrei V. Gromov², Valeriy I. Khartanovich², Vyacheslav Moiseyev², P. S. Grebenyuk², Alexander Yu. Fedorchenko², A. I. Lebedintsev², Sergey B. Slobodin², Boris Malyarchuk², Rui Martiniano¹¹, Morten Meldgaard¹, Morten Meldgaard¹², Laura Arppe¹³, Jukka U. Palo¹⁴, Jukka U. Palo¹⁵, Tarja Sundell¹⁴, Kristiina Mannermaa¹⁴, Mikko Putkonen¹⁴, Verner Alexandersen¹, Charlotte Primeau¹, Nurbol Baimukhanov, Ripan S. Malhi¹⁶, Karl-Göran Sjögren¹⁷, Kristian Kristiansen¹⁷, Anna Wessman¹⁸, Anna Wessman¹⁴, Antti Sajantila¹⁴, Marta Mirazón Lahr¹, Marta Mirazón Lahr¹¹, Richard Durbin¹¹, Richard Durbin¹⁹, Rasmus Nielsen¹, Rasmus Nielsen²⁰, David J. Meltzer¹, David J. Meltzer²¹, Laurent Excoffier³, Laurent Excoffier⁴, Eske Willerslev

University of Copenhagen¹, Russian Academy of Sciences², University of Bern³, Swiss Institute of Bioinformatics⁴, University of Lisbon⁵, Technical University of Denmark⁶, University of Chicago⁷, Chinese Academy of Sciences⁸, Leiden University⁹, Arctic and Antarctic Research Institute¹⁰, University of Cambridge¹¹, University of Greenland¹², American Museum of Natural History¹³, University of Helsinki¹⁴, National Institutes of Health¹⁵, University of Illinois at Urbana–Champaign¹⁶, University of Gothenburg¹⁷, University of Turku¹⁸, Wellcome Trust Sanger Institute¹⁹, University of California, Berkeley²⁰, Southern Methodist University²¹

13 Jun 2019-Nature

TL;DR: Analysis of 34 newly recovered ancient genomes from northeastern Siberia reveals at least three major migration events in the late Pleistocene population history of the region, including an initial peopling by a previously unknown Palaeolithic population of ‘Ancient North Siberians’ and a Holocene migration of other East Asian-related peoples, which generated the mosaic genetic make-up of contemporary peoples.

Abstract: Northeastern Siberia has been inhabited by humans for more than 40,000 years but its deep population history remains poorly understood. Here we investigate the late Pleistocene population history of northeastern Siberia through analyses of 34 newly recovered ancient genomes that date to between 31,000 and 600 years ago. We document complex population dynamics during this period, including at least three major migration events: an initial peopling by a previously unknown Palaeolithic population of ‘Ancient North Siberians’ who are distantly related to early West Eurasian hunter-gatherers; the arrival of East Asian-related peoples, which gave rise to ‘Ancient Palaeo-Siberians’ who are closely related to contemporary communities from far-northeastern Siberia (such as the Koryaks), as well as Native Americans; and a Holocene migration of other East Asian-related peoples, who we name ‘Neo-Siberians’, and from whom many contemporary Siberians are descended. Each of these population expansions largely replaced the earlier inhabitants, and ultimately generated the mosaic genetic make-up of contemporary peoples who inhabit a vast area across northern Eurasia and the Americas. Analyses of 34 ancient genomes from northeastern Siberia, dating to between 31,000 and 600 years ago, reveal at least three major migration events in the late Pleistocene population history of the region.

211 citations

Journal Article

WormBase: a comprehensive data resource for Caenorhabditis biology and genomics

Nansheng Chen¹, Todd W. Harris, Igor Antoshechkin, Carol Bastiani, Tamberlyn Bieri, Darin Blasiar, Keith Bradnam, Payan Canaran, Juancarlos Chan, Chao-Kung Chen, Wen J. Chen, Fiona Cunningham, Paul H. Davis, Eimear E. Kenny, Ranjana Kishore, Daniel Lawson, Raymond Lee, Hans-Michael Müller, Cecilia Nakamura, Shraddha Pai, Philip Ozersky, Andrei Petcherski, Anthony Rogers, Aniko Sabo, Erich M. Schwarz, Kimberly Van Auken, Qinghua Wang, Richard Durbin, John Spieth, Paul W. Sternberg, Lincoln Stein

Cold Spring Harbor Laboratory¹

17 Dec 2004-Nucleic Acids Research

TL;DR: Internally, the database models are restructured to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year.

Abstract: WormBase (http://www.wormbase.org), the model organism database for information about Caenorhabditis elegans and related nematodes, continues to expand in breadth and depth. Over the past year, WormBase has added multiple large-scale datasets including SAGE, interactome, 3D protein structure datasets and NCBI KOGs. To accommodate this growth, the International WormBase Consortium has improved the user interface by adding new features to aid in navigation, visualization of large-scale datasets, advanced searching and data mining. Internally, we have restructured the database models to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year.

203 citations

Journal Article

Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins

Alfonso Buil¹, Andrew A. Brown², Tuuli Lappalainen¹, Ana Viñuela³, Matthew N. Davies³, Hou-Feng Zheng⁴, J. Brent Richards³, Daniel Glass³, Kerrin S. Small³, Richard Durbin², Tim D. Spector³, Emmanouil T. Dermitzakis¹

Swiss Institute of Bioinformatics¹, Wellcome Trust Sanger Institute², King's College London³, McGill University⁴

01 Jan 2015-Nature Genetics

TL;DR: A model is proposed in which allele-specific expression (ASE) requires genetic variability in cis (a difference in the sequence of the two alleles), while the magnitude of the ASE effect depends on trans genetic and environmental factors that interact with the cis genetic variants.

Abstract: Understanding the genetic architecture of gene expression is an intermediate step in understanding the genetic architecture of complex diseases. RNA sequencing technologies have improved the quantification of gene expression and allow measurement of allele-specific expression (ASE). ASE is hypothesized to result from the direct effect of cis regulatory variants, but a proper estimation of the causes of ASE has not been performed thus far. In this study, we take advantage of a sample of twins to measure the relative contributions of genetic and environmental effects to ASE, and we find substantial effects from gene × gene (G×G) and gene × environment (G×E) interactions. We propose a model where ASE requires genetic variability in cis, a difference in the sequence of both alleles, but where the magnitude of the ASE effect depends on trans genetic and environmental factors that interact with the cis genetic variants.

200 citations

Posted Content

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

Valerie A. Schneider¹, Tina A. Graves-Lindsay², Kerstin Howe³, Nathan Bouk¹, Hsiu-Chuan Chen¹, Paul Kitts¹, Terence Murphy¹, Kim D. Pruitt¹, Françoise Thibaud-Nissen¹, Derek Albracht², Robert S. Fulton², Milinn Kremitzki², Vincent Magrini², Chris Markovic², Sean McGrath², Karyn Meltz Steinberg², Kate Auger³, William Chow³, Joanna Collins³, Glenn Harden³, Tim Hubbard⁴, Sarah Pelan³, Jared T. Simpson⁵, Glen Threadgold³, James Torrance³, Jonathan Wood³, Laura Clarke⁶, Sergey Koren¹, Matthew Boitano⁷, Heng Li⁸, Chen-Shan Chin⁷, Adam M. Phillippy¹, Richard Durbin³, Richard K. Wilson², Paul Flicek⁶, Deanna M. Church¹

National Institutes of Health¹, University of Washington², Wellcome Trust Sanger Institute³, Queen Mary University of London⁴, Ontario Institute for Cancer Research⁵, European Bioinformatics Institute⁶, Pacific Biosciences⁷, Broad Institute⁸

30 Aug 2016-bioRxiv

TL;DR: It is asserted that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote the understanding of human biology and advance the efforts to improve health.

Abstract: The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that while the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.

194 citations

Cited by

Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul¹, Thomas L. Madden, Alejandro A. Schäffer¹, Jinghui Zhang, Zheng Zhang², Webb Miller², David J. Lipman

National Institutes of Health¹, Pennsylvania State University²

01 Sep 1997-Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal Article

The Sequence Alignment/Map format and SAMtools

Heng Li¹, Bob Handsaker², Alec Wysoker², T. J. Fennell², Jue Ruan³, Nils Homer², Gabor T. Marth⁴, Gonçalo R. Abecasis², Richard Durbin¹

Wellcome Trust Sanger Institute¹, University of California, Los Angeles², Chinese Academy of Sciences³, Boston College⁴

01 Aug 2009-Bioinformatics

TL;DR: SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant calling and alignment viewing, and thus provides universal tools for processing read alignments.

Abstract: Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net Contact: [email protected]

45,957 citations

Journal Article

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li¹, Richard Durbin¹

Wellcome Trust Sanger Institute¹

01 Jul 2009-Bioinformatics

TL;DR: The Burrows-Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT), is implemented to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: [email protected]

43,862 citations

Journal Article

Fiji: an open-source platform for biological-image analysis

Johannes Schindelin¹, Ignacio Arganda-Carreras², Erwin Frise³, Verena Kaynig⁴, Mark Longair⁴, Tobias Pietzsch¹, Stephan Preibisch¹, Curtis Rueden⁵, Stephan Saalfeld¹, Benjamin Schmid¹, Jean-Yves Tinevez¹, Daniel J. White¹, Volker Hartenstein¹, Kevin W. Eliceiri⁵, Pavel Tomancak¹, Albert Cardona¹

Max Planck Society¹, Massachusetts Institute of Technology², Lawrence Berkeley National Laboratory³, ETH Zurich⁴, University of Wisconsin-Madison⁵

01 Jul 2012-Nature Methods

TL;DR: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis that facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system.

Abstract: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

43,540 citations

Journal Article

Trimmomatic: a flexible trimmer for Illumina sequence data

Anthony Bolger¹, Marc Lohse¹, Bjoern Usadel¹

Max Planck Society¹

01 Aug 2014-Bioinformatics

TL;DR: Trimmomatic is developed as a more flexible and efficient preprocessing tool that correctly handles paired-end data, and is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

Abstract: Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. Availability and implementation: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic Contact: usadel@bio1.rwth-aachen.de Supplementary information: Supplementary data are available at Bioinformatics online.

39,291 citations
