Author

John W. Davey

Bio: John W. Davey is an academic researcher at the University of York. The author has contributed to research on the topics Heliconius and Heliconius melpomene. The author has an h-index of 25 and has co-authored 48 publications receiving 7,429 citations. Previous affiliations of John W. Davey include the University of Cambridge and Ghent University.

Papers
Journal ArticleDOI
TL;DR: Best practices for several NGS methods for genome-wide genetic marker development and genotyping that use restriction enzyme digestion of target genomes to reduce the complexity of the target.
Abstract: The authors describe the best practices for a growing number of methods that use next-generation sequencing to rapidly discover and assess genetic markers across any genome, with applications from population genomics and quantitative trait locus mapping to marker-assisted selection.

2,231 citations

Journal ArticleDOI
Kanchon K. Dasmahapatra, James R. Walters, Adriana D. Briscoe, John W. Davey, Annabel Whibley, Nicola J. Nadeau, Aleksey V. Zimin, Daniel S.T. Hughes, Laura Ferguson, Simon H. Martin, Camilo Salazar, James J. Lewis, Sebastian Adler, Seung-Joon Ahn, Dean A. Baker, Simon W. Baxter, Nicola Chamberlain, Ritika Chauhan, Brian A. Counterman, Tamas Dalmay, Lawrence E. Gilbert, Karl H.J. Gordon, David G. Heckel, Heather M. Hines, Katharina J. Hoff, Peter W. H. Holland, Emmanuelle Jacquin-Joly, Francis M. Jiggins, Robert T. Jones, Durrell D. Kapan, Paul J. Kersey, Gerardo Lamas, Daniel Lawson, Daniel Mapleson, Luana S. Maroja, Arnaud Martin, Simon Moxon, William J. Palmer, Riccardo Papa, Alexie Papanicolaou, Yannick Pauchet, David A. Ray, Neil Rosser, Steven L. Salzberg, Megan A. Supple, Alison K. Surridge, Ayşe Tenger-Trolander, Heiko Vogel, Paul A. Wilkinson, Derek Wilson, James A. Yorke, Furong Yuan, Alexi Balmuth, Cathlene Eland, Karim Gharbi, Marian Thomson, Richard A. Gibbs, Yi Han, Joy Jayaseelan, Christie Kovar, Tittu Mathew, Donna M. Muzny, Fiona Ongeri, Ling-Ling Pu, Jiaxin Qu, Rebecca Thornton, Kim C. Worley, Yuanqing Wu, Mauricio Linares, Mark Blaxter, Richard H. ffrench-Constant, Mathieu Joron, Marcus R. Kronforst, Sean P. Mullen, Robert D. Reed, Steven E. Scherer, Stephen Richards, James Mallet, W. Owen McMillan, Chris D. Jiggins
05 Jul 2012-Nature
TL;DR: It is inferred that closely related Heliconius species exchange protective colour-pattern genes promiscuously, implying that hybridization has an important role in adaptive radiation.
Abstract: Sequencing of the genome of the butterfly Heliconius melpomene shows that closely related Heliconius species exchange protective colour-pattern genes promiscuously.

1,103 citations

Journal ArticleDOI
TL;DR: Restriction-site associated DNA sequencing, a method that samples at reduced complexity across target genomes, promises to deliver high-resolution population genomic data (thousands of sequenced markers across many individuals) for any organism at reasonable cost.
Abstract: Next-generation sequencing technologies are making a substantial impact on many areas of biology, including the analysis of genetic diversity in populations. However, genome-scale population genetic studies have been accessible only to well-funded model systems. Restriction-site associated DNA sequencing, a method that samples at reduced complexity across target genomes, promises to deliver high-resolution population genomic data (thousands of sequenced markers across many individuals) for any organism at reasonable cost. It has found application in wild populations and non-traditional study species, and promises to become an important technology for ecological population genomics.

662 citations
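
As a rough illustration of how restriction-site sampling reduces genome complexity, the sketch below counts occurrences of an enzyme recognition site in a sequence and estimates the number of sequenced tags. It is a minimal sketch, not the authors' pipeline: the function names, the toy sequence and the choice of GAATTC (the EcoRI recognition site) are assumptions made for this example, as is the rule of two tags per cut site (one read in each direction from the cut).

```python
def find_cut_sites(genome: str, site: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of `site`."""
    genome = genome.upper()
    site = site.upper()
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        # Step forward one base so overlapping occurrences are also found.
        start = genome.find(site, start + 1)
    return positions


def expected_rad_tags(genome: str, site: str) -> int:
    """Estimate the tag count, assuming two tags per cut site (an assumption)."""
    return 2 * len(find_cut_sites(genome, site))


# Toy sequence invented for illustration; GAATTC is the EcoRI recognition site.
toy_genome = "AAGAATTCTTTTGAATTCAA"
cut_sites = find_cut_sites(toy_genome, "GAATTC")
tag_count = expected_rad_tags(toy_genome, "GAATTC")
```

Only the sequence flanking each cut site is read, so the number of markers scales with the frequency of the recognition site rather than with genome size.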

Journal ArticleDOI
TL;DR: This special issue on ‘Genotyping-by-Sequencing in Ecological and Conservation Genomics’ represents a diverse set of empirical and theoretical studies that demonstrate both the utility and some of the challenges of GBS in ecological and conservation genomics.
Abstract: The fields of ecological and conservation genetics have developed greatly in recent decades through the use of molecular markers to investigate organisms in their natural habitat and to evaluate the effect of anthropogenic disturbances. However, many of these studies have been limited to narrow regions of the genome, allowing for limited inferences but making it difficult to generalize about the organisms and their evolutionary history. Tremendous advances in sequencing technology over the last decade (i.e. next-generation sequencing; NGS) have led to the ability to sample the genome much more densely and to observe the patterns of genetic variation that result from the full range of evolutionary processes acting across the genome (Allendorf et al. 2010; Stapley et al. 2010; Li et al. 2012). These studies are transforming molecular ecology by making many long-standing questions much more easily accessible in almost any organism. When studying the genetics of wild populations, it is desirable to sample tens, hundreds or even thousands of individuals. While it is now possible to sequence whole genomes for tens of individuals with small genome sizes, the sequencing of hundreds of individuals with large genomes remains prohibitively expensive, particularly where the genome sequence is unknown. Further, for the purpose of many studies, complete genomic sequence data for all individuals would be unnecessary and simply inflate the computational and bioinformatic costs. A major recent advance has been the development of genotyping-by-sequencing (GBS) approaches that allow a targeted fraction of the genome (a reduced representation library) to be sequenced with next-generation technology rather than the entire genome, even in species with little or no previous genomic information and large genomes.
The subset of the genome to be sequenced in these GBS approaches may be targeted using restriction enzymes or capture probes or by sequencing the transcriptome (reviewed in Davey et al. 2011). In the future, as sequencing technology and computational and bioinformatic methods develop further, whole-genome resequencing may become the predominant method for ecological and conservation genomics. Currently, reduced representation approaches offer the ability to not only discover genetic variants such as SNPs but also genotype individuals at these newly discovered loci in the same data. This special issue on ‘Genotyping-by-Sequencing in Ecological and Conservation Genomics’ represents a diverse set of empirical and theoretical studies that demonstrate both the utility and some of the challenges of GBS in ecological and conservation genomics. The empirical studies include demonstrations of the utility of GBS for population genomics and association mapping, as well as the development of genomic resources (i.e. large SNP data sets) for target species. The studies also illustrate some of the differences between GBS methods, in particular, aligning paired-end reads to achieve longer consensus sequences in contrast to single-end reads with shorter alignments, and double-digest versus sonication methods to fragment DNA. In addition, several papers describe advanced data pipelines for handling GBS-related sequence data and critically evaluate best practices for GBS methods and potential biases and novel features associated with GBS data. Overall, this compilation of papers emphasizes that GBS has been quickly adopted by the scientific community and is expected to become a common tool for studies in molecular ecology.

505 citations

Journal ArticleDOI
TL;DR: It is found that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity, and a related statistic f_d is proposed, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture.
Abstract: Several methods have been proposed to test for introgression across genomes. One method tests for a genome-wide excess of shared derived alleles between taxa using Patterson's D statistic, but does not establish which loci show such an excess or whether the excess is due to introgression or ancestral population structure. Several recent studies have extended the use of D by applying the statistic to small genomic regions, rather than genome-wide. Here, we use simulations and whole-genome data from Heliconius butterflies to investigate the behavior of D in small genomic regions. We find that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity. As an alternative, we propose a related statistic f_d, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture. f_d is not subject to the same biases as D, and is better at identifying introgressed loci. Finally, we show that both D and f_d outliers tend to cluster in regions of low absolute divergence (d_XY), which can confound a recently proposed test for differentiating introgression from shared ancestral variation at individual loci.

489 citations
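
The two statistics named in the abstract above can be sketched from per-site derived-allele frequencies. This is a minimal sketch under stated assumptions, not the authors' code: it assumes the usual four-taxon layout (P1, P2, P3 and an outgroup O), takes ABBA and BABA as frequency-weighted site patterns, and computes f_d by replacing P2 and P3 in the denominator with whichever of the two has the higher derived-allele frequency at each site. The function and variable names are chosen for illustration.

```python
def abba_baba_sums(p1, p2, p3, p4):
    """Sum the frequency-weighted ABBA and BABA site patterns across sites."""
    abba = sum((1 - a) * b * c * (1 - d) for a, b, c, d in zip(p1, p2, p3, p4))
    baba = sum(a * (1 - b) * c * (1 - d) for a, b, c, d in zip(p1, p2, p3, p4))
    return abba, baba


def patterson_d(p1, p2, p3, p4):
    """D = (ABBA - BABA) / (ABBA + BABA); NaN when no informative sites exist."""
    abba, baba = abba_baba_sums(p1, p2, p3, p4)
    total = abba + baba
    return (abba - baba) / total if total > 0 else float("nan")


def f_d(p1, p2, p3, p4):
    """f_d: observed ABBA - BABA excess divided by the excess expected under
    complete introgression, using the 'donor' frequency max(p2, p3) per site."""
    abba, baba = abba_baba_sums(p1, p2, p3, p4)
    donor = [max(b, c) for b, c in zip(p2, p3)]
    abba_d, baba_d = abba_baba_sums(p1, donor, donor, p4)
    denom = abba_d - baba_d
    return (abba - baba) / denom if denom > 0 else float("nan")


# Hypothetical single-site example: derived allele absent in P1 and O,
# at frequency 0.5 in P2 and fixed in P3.
d_value = patterson_d([0.0], [0.5], [1.0], [0.0])
fd_value = f_d([0.0], [0.5], [1.0], [0.0])
```

In this toy case D takes its maximum value even though only half of P2 carries the shared allele, whereas f_d scales with the P2 frequency, which is one way to see why the two statistics can rank small windows differently.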


Cited by
Journal Article
Fumio Tajima
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

10,124 citations

01 Aug 2000
TL;DR: A bioentrepreneur course on the assessment of medical technology in the context of commercialization, covering many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal ArticleDOI
TL;DR: The expanded population genomics functions in Stacks will make it a useful tool to harness the newest generation of massively parallel genotyping data for ecological and evolutionary genetics.
Abstract: Massively parallel short-read sequencing technologies, coupled with powerful software platforms, are enabling investigators to analyse tens of thousands of genetic markers. This wealth of data is rapidly expanding and allowing biological questions to be addressed with unprecedented scope and precision. The sizes of the data sets are now posing significant data processing and analysis challenges. Here we describe an extension of the Stacks software package to efficiently use genotype-by-sequencing data for studies of populations of organisms. Stacks now produces core population genomic summary statistics and SNP-by-SNP statistical tests. These statistics can be analysed across a reference genome using a smoothed sliding window. Stacks also now provides several output formats for several commonly used downstream analysis packages. The expanded population genomics functions in Stacks will make it a useful tool to harness the newest generation of massively parallel genotyping data for ecological and evolutionary genetics.

2,958 citations
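
The smoothed sliding window mentioned in the abstract above can be illustrated with a kernel-weighted average of a per-SNP statistic along a chromosome. This is a minimal sketch and not the Stacks implementation: the Gaussian kernel, the default sigma, the cutoff at three sigma and the function name are all assumptions made for this example.

```python
import math


def smoothed_statistic(positions, values, center, sigma=150_000.0):
    """Gaussian-weighted mean of `values` at SNP `positions` around `center`.

    SNPs further than 3 * sigma from the centre are ignored; NaN is returned
    when no SNP falls inside the window.
    """
    limit = 3 * sigma
    weighted_sum = 0.0
    weight_total = 0.0
    for pos, val in zip(positions, values):
        dist = abs(pos - center)
        if dist > limit:
            continue
        weight = math.exp(-(dist ** 2) / (2 * sigma ** 2))
        weighted_sum += weight * val
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else float("nan")


# Hypothetical per-SNP values (for example F_ST) at three positions.
snp_positions = [100, 200, 300]
snp_values = [0.0, 1.0, 2.0]
smoothed_at_200 = smoothed_statistic(snp_positions, snp_values, 200, sigma=50.0)
```

Evaluating the function at a series of centres along a reference genome gives a smoothed profile; nearby SNPs dominate each estimate and distant ones contribute little.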

Journal ArticleDOI
31 May 2012-PLOS ONE
TL;DR: This modified RADseq approach requires no prior genomic knowledge and achieves per-site and per-individual costs below that of current SNP chip technology, while requiring similar hands-on time investment, comparable amounts of input DNA, and downstream analysis times on the order of hours.
Abstract: The ability to efficiently and accurately determine genotypes is a keystone technology in modern genetics, crucial to studies ranging from clinical diagnostics, to genotype-phenotype association, to reconstruction of ancestry and the detection of selection. To date, high capacity, low cost genotyping has been largely achieved via "SNP chip" microarray-based platforms which require substantial prior knowledge of both genome sequence and variability, and once designed are suitable only for those targeted variable nucleotide sites. This method introduces substantial ascertainment bias and inherently precludes detection of rare or population-specific variants, a major source of information for both population history and genotype-phenotype association. Recent developments in reduced-representation genome sequencing experiments on massively parallel sequencers (commonly referred to as RAD-tag or RADseq) have brought direct sequencing to the problem of population genotyping, but increased cost and procedural and analytical complexity have limited their widespread adoption. Here, we describe a complete laboratory protocol, including a custom combinatorial indexing method, and accompanying software tools to facilitate genotyping across large numbers (hundreds or more) of individuals for a range of markers (hundreds to hundreds of thousands). Our method requires no prior genomic knowledge and achieves per-site and per-individual costs below that of current SNP chip technology, while requiring similar hands-on time investment, comparable amounts of input DNA, and downstream analysis times on the order of hours. Finally, we provide empirical results from the application of this method to both genotyping in a laboratory cross and in wild populations. Because of its flexibility, this modified RADseq approach promises to be applicable to a diversity of biological questions in a wide range of organisms.

2,734 citations