Author

Harold Watson

Bio: Harold Watson is an academic researcher from the University of the West Indies. The author has contributed to research in topics: Genome-wide association study & Single-nucleotide polymorphism. The author has an h-index of 11 and has co-authored 24 publications that have received 12,222 citations.
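
The h-index quoted in the bio is the largest number h such that h of an author's papers each have at least h citations. A minimal Python sketch of that definition follows; the function name and the citation counts in the usage lines are hypothetical illustrations, not this author's actual record.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break  # counts only decrease from here, so h cannot grow further
    return h


# Hypothetical citation counts, for illustration only.
print(h_index([10, 8, 5, 4, 3]))  # 4
print(h_index([25, 8, 5, 3, 3]))  # 3
```

Because the metric depends only on the ranked counts, a single very highly cited paper raises the total citation count far more than it raises the h-index.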

Papers
Journal ArticleDOI
Adam Auton1, Gonçalo R. Abecasis2, David Altshuler3, Richard Durbin4, +514 more; Institutions (90)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations
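
As a quick arithmetic check on the abstract above, the three variant classes it lists sum to about 88.36 million, which is consistent with "over 88 million variants". A minimal sketch, treating the rounded figures quoted in the abstract as exact counts (an assumption; the structural-variant figure in particular is rounded), with hypothetical variable and function names:

```python
# Rounded counts as quoted in the abstract (treated as exact here, which is an assumption).
variant_counts = {
    "SNPs": 84_700_000,
    "short indels": 3_600_000,
    "structural variants": 60_000,
}


def total_variants(counts):
    """Sum the per-class variant counts."""
    return sum(counts.values())


total = total_variants(variant_counts)
print(f"total: {total:,}")  # total: 88,360,000

# Share of each class in the overall tally.
for name, n in variant_counts.items():
    print(f"{name}: {n / total:.1%} of all variants")
```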

01 Oct 2015
TL;DR: The 1000 Genomes Project provided a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. This paper reports completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

3,247 citations

Journal ArticleDOI
Dara G. Torgerson1, Dara G. Torgerson2, Elizabeth J. Ampleford3, Grace Y. Chiu4, W. James Gauderman5, Christopher R. Gignoux6, Penelope E. Graves7, Blanca E. Himes8, Albert M. Levin9, Rasika A. Mathias10, Dana B. Hancock1, Dana B. Hancock6, Dana B. Hancock11, James W. Baurley5, Celeste Eng6, Debra A. Stern7, Juan C. Celedón12, Nicholas Rafaels10, Daniel Capurso5, David V. Conti6, Lindsey A. Roth, Manuel Soto-Quiros10, Alkis Togias3, Xingnan Li2, Rachel A. Myers, Isabelle Romieu5, Isabelle Romieu13, David Van Den Berg6, Donglei Hu10, Nadia N. Hansel6, Ryan D. Hernandez8, Elliott Israel5, Muhammad T. Salam6, Joshua Galanter14, Pedro C. Avila, Lydiana Avila, Jose R. Rodriquez-Santana, R. Chapela15, William Rodríguez-Cintrón10, Gregory B. Diette10, N. Franklin Adkinson10, Rebekah A. Abel2, K. Ross2, Min Shi11, Mezbah U. Faruque16, Georgia M. Dunston16, Harold Watson17, Vito J. Mantese10, Serpil C. Ezurum18, Liming Liang8, Ingo Ruczinski10, Jean G. Ford10, Scott Huntsman6, Kian Fan Chung19, Hita Vora5, Xia Li5, William J. Calhoun20, Mario Castro21, Juan José Luis Sienra-Monge, Blanca Estela Del Río-Navarro, Klaus A. Deichmann22, Andrea Heinzmann22, Sally E. Wenzel22, William W. Busse23, William W. Busse12, James E. Gern23, Robert F. Lemanske23, Terri H. Beaty10, Eugene R. Bleecker3, Benjamin A. Raby8, Deborah A. Meyers3, Stephanie J. London10, Frank D. Gilliland5, Esteban G. Burchard6, Fernando D. Martinez7, Scott T. Weiss8, L. Keoki Williams9, Kathleen C. Barnes10, Carole Ober2, Dan L. Nicolae2 
TL;DR: The results suggest that some asthma susceptibility loci are robust to differences in ancestry when sufficiently large sample sizes are investigated, and that ancestry-specific associations also contribute to the complex genetic architecture of asthma.
Abstract: Asthma is a common disease with a complex risk architecture including both genetic and environmental factors. We performed a meta-analysis of North American genome-wide association studies of asthma in 5,416 individuals with asthma (cases) including individuals of European American, African American or African Caribbean, and Latino ancestry, with replication in an additional 12,649 individuals from the same ethnic groups. We identified five susceptibility loci. Four were at previously reported loci on 17q21, near IL1RL1, TSLP and IL33, but we report for the first time, to our knowledge, that these loci are associated with asthma risk in three ethnic groups. In addition, we identified a new asthma susceptibility locus at PYHIN1, with the association being specific to individuals of African descent (P = 3.9 × 10⁻⁹). These results suggest that some asthma susceptibility loci are robust to differences in ancestry when sufficiently large sample sizes are investigated, and that ancestry-specific associations also contribute to the complex genetic architecture of asthma.

773 citations

Journal ArticleDOI
TL;DR: A deeply sequenced dataset of 910 individuals, all of African descent, is used to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome.
Abstract: We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the populations of African descent, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein-coding genes, and the rest appear to be intergenic.

262 citations

Journal ArticleDOI
TL;DR: This study illustrates the complexity of identifying true associations for a complex and heterogeneous disease, such as asthma, in admixed populations, especially populations of African descent.
Abstract: Background: Asthma is a complex disease characterized by striking ethnic disparities not explained entirely by environmental, social, cultural, or economic factors. Of the limited genetic studies performed on populations of African descent, notable differences in susceptibility allele frequencies have been observed. Objectives: We sought to test the hypothesis that some genes might contribute to the profound disparities in asthma. Methods: We performed a genome-wide association study in 2 independent populations of African ancestry (935 African American asthmatic cases and control subjects from the Baltimore–Washington, DC, area and 929 African Caribbean asthmatic subjects and their family members from Barbados) to identify single-nucleotide polymorphisms (SNPs) associated with asthma. Results: A meta-analysis combining these 2 African-ancestry populations yielded 3 SNPs with a combined P value of less than 10⁻⁵ in genes of potential biologic relevance to asthma and allergic disease: rs10515807, mapping to the α-1B-adrenergic receptor (ADRA1B) gene on chromosome 5q33 (3.57 × 10⁻⁶); rs6052761, mapping to the prion-related protein (PRNP) gene on chromosome 20pter-p12 (2.27 × 10⁻⁶); and rs1435879, mapping to the dipeptidyl peptidase 10 (DPP10) gene on chromosome 2q12.3-q14.2. The generalizability of these findings was tested in family and case-control panels of United Kingdom and German origin, respectively, but none of the associations observed in the African groups were replicated in these European studies. Evidence for association was also examined in 4 additional case-control studies of African Americans; however, none of the SNPs implicated in the discovery population were replicated. Conclusions: This study illustrates the complexity of identifying true associations for a complex and heterogeneous disease, such as asthma, in admixed populations, especially populations of African descent.

200 citations


Cited by
Journal ArticleDOI
Monkol Lek, Konrad J. Karczewski1, Konrad J. Karczewski2, Eric Vallabh Minikel2, Eric Vallabh Minikel1, Kaitlin E. Samocha, Eric Banks2, Timothy Fennell2, Anne H. O’Donnell-Luria2, Anne H. O’Donnell-Luria3, Anne H. O’Donnell-Luria1, James S. Ware, Andrew J. Hill1, Andrew J. Hill2, Andrew J. Hill4, Beryl B. Cummings2, Beryl B. Cummings1, Taru Tukiainen1, Taru Tukiainen2, Daniel P. Birnbaum2, Jack A. Kosmicki, Laramie E. Duncan2, Laramie E. Duncan1, Karol Estrada1, Karol Estrada2, Fengmei Zhao1, Fengmei Zhao2, James Zou2, Emma Pierce-Hoffman2, Emma Pierce-Hoffman1, Joanne Berghout5, David Neil Cooper6, Nicole A. Deflaux7, Mark A. DePristo2, Ron Do, Jason Flannick2, Jason Flannick1, Menachem Fromer, Laura D. Gauthier2, Jackie Goldstein2, Jackie Goldstein1, Namrata Gupta2, Daniel P. Howrigan2, Daniel P. Howrigan1, Adam Kiezun2, Mitja I. Kurki2, Mitja I. Kurki1, Ami Levy Moonshine2, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso2, Gina M. Peloso1, Ryan Poplin2, Manuel A. Rivas2, Valentin Ruano-Rubio2, Samuel A. Rose2, Douglas M. Ruderfer8, Khalid Shakir2, Peter D. Stenson6, Christine Stevens2, Brett Thomas1, Brett Thomas2, Grace Tiao2, María Teresa Tusié-Luna, Ben Weisburd2, Hong-Hee Won9, Dongmei Yu, David Altshuler2, David Altshuler10, Diego Ardissino, Michael Boehnke11, John Danesh12, Stacey Donnelly2, Roberto Elosua, Jose C. Florez1, Jose C. Florez2, Stacey Gabriel2, Gad Getz1, Gad Getz2, Stephen J. Glatt13, Christina M. Hultman14, Sekar Kathiresan, Markku Laakso15, Steven A. McCarroll2, Steven A. McCarroll1, Mark I. McCarthy16, Mark I. McCarthy17, Dermot P.B. McGovern18, Ruth McPherson19, Benjamin M. Neale2, Benjamin M. Neale1, Aarno Palotie, Shaun Purcell8, Danish Saleheen20, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan21, Patrick F. Sullivan14, Jaakko Tuomilehto22, Ming T. Tsuang23, Hugh Watkins16, Hugh Watkins17, James G. Wilson24, Mark J. Daly1, Mark J. Daly2, Daniel G. MacArthur1, Daniel G. MacArthur2 
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal ArticleDOI
11 Oct 2018-Nature
TL;DR: Deep phenotype and genome-wide genetic data from 500,000 individuals in the UK Biobank are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.
Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Journal ArticleDOI
12 Oct 2017-Nature
TL;DR: It is found that local genetic variation affects gene expression levels for the majority of genes, and inter-chromosomal genetic effects for 93 genes and 112 loci are identified, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Abstract: Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

3,289 citations

Journal ArticleDOI
TL;DR: These and other strategies are providing researchers and clinicians a variety of tools to probe genomes in greater depth, leading to an enhanced understanding of how genome sequence variants underlie phenotype and disease.
Abstract: Since the completion of the human genome project in 2003, extraordinary progress has been made in genome sequencing technologies, which has led to a decreased cost per megabase and an increase in the number and diversity of sequenced genomes. An astonishing complexity of genome architecture has been revealed, bringing these sequencing technologies to even greater advancements. Some approaches maximize the number of bases sequenced in the least amount of time, generating a wealth of data that can be used to understand increasingly complex phenotypes. Alternatively, other approaches now aim to sequence longer contiguous pieces of DNA, which are essential for resolving structurally complex regions. These and other strategies are providing researchers and clinicians a variety of tools to probe genomes in greater depth, leading to an enhanced understanding of how genome sequence variants underlie phenotype and disease.

3,096 citations

Journal ArticleDOI
TL;DR: Data access is improved with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database.
Abstract: The GWAS Catalog delivers a high-quality curated collection of all published genome-wide association studies enabling investigations to identify causal variants, understand disease mechanisms, and establish targets for novel therapies. The scope of the Catalog has also expanded to targeted and exome arrays with 1000 new associations added for these technologies. As of September 2018, the Catalog contains 5687 GWAS comprising 71673 variant-trait associations from 3567 publications. New content includes 284 full P-value summary statistics datasets for genome-wide and new targeted array studies, representing 6 × 10⁹ individual variant-trait statistics. In the last 12 months, the Catalog's user interface was accessed by ∼90000 unique users who viewed >1 million pages. We have improved data access with the release of a new RESTful API to support high-throughput programmatic access, an improved web interface and a new summary statistics database. Summary statistics provision is supported by a new format proposed as a community standard for summary statistics data representation. This format was derived from our experience in standardizing heterogeneous submissions, mapping formats and in harmonizing content. Availability: https://www.ebi.ac.uk/gwas/.

2,878 citations