
Showing papers on "Exome sequencing" published in 2019


Posted ContentDOI
Konrad J. Karczewski1,2, Laurent C. Francioli1,2, Grace Tiao1,2, Beryl B. Cummings1,2, Jessica Alföldi1,2, Qingbo Wang1,2, Ryan L. Collins1,2, Kristen M. Laricchia1,2, Andrea Ganna1,2,3, Daniel P. Birnbaum1, Laura D. Gauthier1, Harrison Brand1,2, Matthew Solomonson1,2, Nicholas A. Watts1,2, Daniel R. Rhodes4, Moriel Singer-Berk1, Eleanor G. Seaby1,2, Jack A. Kosmicki1,2, Raymond K. Walters1,2, Katherine Tashman1,2, Yossi Farjoun1, Eric Banks1, Timothy Poterba1,2, Arcturus Wang1,2, Cotton Seed1,2, Nicola Whiffin1,5, Jessica X. Chong6, Kaitlin E. Samocha7, Emma Pierce-Hoffman1, Zachary Zappala1,8, Anne H. O’Donnell-Luria1,2,9, Eric Vallabh Minikel1, Ben Weisburd1, Monkol Lek1,10, James S. Ware1,5, Christopher Vittal1,2, Irina M. Armean1,2,11, Louis Bergelson1, Kristian Cibulskis1, Kristen M. Connolly1, Miguel Covarrubias1, Stacey Donnelly1, Steven Ferriera1, Stacey Gabriel1, Jeff Gentry1, Namrata Gupta1, Thibault Jeandet1, Diane Kaplan1, Christopher Llanwarne1, Ruchi Munshi1, Sam Novod1, Nikelle Petrillo1, David Roazen1, Valentin Ruano-Rubio1, Andrea Saltzman1, Molly Schleicher1, Jose Soto1, Kathleen Tibbetts1, Charlotte Tolonen1, Gordon Wade1, Michael E. Talkowski1,2, Benjamin M. Neale1,2, Mark J. Daly1, Daniel G. MacArthur1,2
30 Jan 2019-bioRxiv
TL;DR: Using an improved human mutation rate model, human protein-coding genes are classified along a spectrum representing tolerance to inactivation; this classification is validated using data from model organisms and engineered human cells, and is shown to improve gene discovery power for both common and rare diseases.
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.

1,128 citations


Journal ArticleDOI
F. Kyle Satterstrom1, Jack A. Kosmicki1, Jiebiao Wang2, Michael S. Breen3, +150 more (45 institutions)
TL;DR: Using an enhanced Bayesian framework to integrate de novo and case-control rare variation, 102 risk genes are identified at a false discovery rate of ≤ 0.1, consistent with multiple paths to an excitatory/inhibitory imbalance underlying ASD.
Abstract: We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n=35,584 total samples, 11,986 with ASD). Using an enhanced Bayesian framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate ≤ 0.1. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained for severe neurodevelopmental delay, while 53 show higher frequencies in individuals ascertained for ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most of the risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In human cortex single-cell gene expression data, expression of risk genes is enriched in both excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory/inhibitory imbalance underlying ASD.
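
The abstract above reports genes selected at a false discovery rate ≤ 0.1 within a Bayesian framework. As a rough illustration of what such a threshold means, the sketch below converts per-gene Bayes factors into q-values by averaging posterior null probabilities over the top-ranked genes. This is a minimal sketch under stated assumptions, not the study's implementation: the gene names, Bayes factors, and the prior fraction of risk genes (`prior_risk`) are all hypothetical.

```python
def bayesian_q_values(bayes_factors, prior_risk=0.05):
    """Map per-gene Bayes factors to q-values (Bayesian FDR).

    prior_risk is a hypothetical prior fraction of genes that are risk genes.
    """
    # Posterior probability that each gene is NOT a risk gene.
    post_null = {
        gene: (1 - prior_risk) / ((1 - prior_risk) + prior_risk * bf)
        for gene, bf in bayes_factors.items()
    }
    # Rank genes from strongest to weakest evidence.
    ranked = sorted(post_null, key=post_null.get)
    q_values, running = {}, 0.0
    for i, gene in enumerate(ranked, start=1):
        running += post_null[gene]
        # q-value: expected fraction of false discoveries among the top i genes.
        q_values[gene] = running / i
    return q_values


# Hypothetical Bayes factors, for illustration only.
example_bf = {"GENE_A": 5000.0, "GENE_B": 300.0, "GENE_C": 20.0, "GENE_D": 1.5}
q = bayesian_q_values(example_bf)
risk_genes = [gene for gene, value in q.items() if value <= 0.1]
print(risk_genes)
```

Because the q-value is a running mean over genes sorted by evidence, it never decreases down the ranking, so a cutoff such as 0.1 always selects a contiguous block of top-ranked genes.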

461 citations


Journal ArticleDOI
TL;DR: Exome sequencing in a combined cohort of more than 3000 patients with chronic kidney disease yielded a genetic diagnosis in just under 10% of cases, with genetic findings for medically actionable disorders that would also lead to subspecialty referral and inform renal management.
Abstract: Background Exome sequencing is emerging as a first-line diagnostic method in some clinical disciplines, but its usefulness has yet to be examined for most constitutional disorders in adult...

409 citations


Journal ArticleDOI
TL;DR: WES improved the identification of genetic disorders in fetuses with structural abnormalities; however, before clinical implementation, careful consideration should be given to case selection to maximise clinical usefulness.

372 citations


Journal ArticleDOI
23 Feb 2019
TL;DR: In this article, the authors used whole-exome sequencing (WES) to evaluate the presence of genetic variants in developmental disorder genes (diagnostic genetic variants) in a cohort of fetuses with structural anomalies and samples from their parents.
Abstract: Background Fetal structural anomalies, which are detected by ultrasonography, have a range of genetic causes, including chromosomal aneuploidy, copy number variations (CNVs; which are detectable by chromosomal microarrays), and pathogenic sequence variants in developmental genes. Testing for aneuploidy and CNVs is routine during the investigation of fetal structural anomalies, but there is little information on the clinical usefulness of genome-wide next-generation sequencing in the prenatal setting. We therefore aimed to evaluate the proportion of fetuses with structural abnormalities that had identifiable variants in genes associated with developmental disorders when assessed with whole-exome sequencing (WES). Methods In this prospective cohort study, two groups in Birmingham and London recruited patients from 34 fetal medicine units in England and Scotland. We used whole-exome sequencing (WES) to evaluate the presence of genetic variants in developmental disorder genes (diagnostic genetic variants) in a cohort of fetuses with structural anomalies and samples from their parents, after exclusion of aneuploidy and large CNVs. Women were eligible for inclusion if they were undergoing invasive testing for identified nuchal translucency or structural anomalies in their fetus, as detected by ultrasound after 11 weeks of gestation. The partners of these women also had to consent to participate. Sequencing results were interpreted with a targeted virtual gene panel for developmental disorders that comprised 1628 genes. Genetic results related to fetal structural anomaly phenotypes were then validated and reported postnatally. The primary endpoint, which was assessed in all fetuses, was the detection of diagnostic genetic variants considered to have caused the fetal developmental anomaly. Findings The cohort was recruited between Oct 22, 2014, and June 29, 2017, and clinical data were collected until March 31, 2018. 
After exclusion of fetuses with aneuploidy and CNVs, 610 fetuses with structural anomalies and 1202 matched parental samples (analysed as 596 fetus-parental trios, including two sets of twins, and 14 fetus-parent dyads) were analysed by WES. After bioinformatic filtering and prioritisation according to allele frequency and effect on protein and inheritance pattern, 321 genetic variants (representing 255 potential diagnoses) were selected as potentially pathogenic genetic variants (diagnostic genetic variants), and these variants were reviewed by a multidisciplinary clinical review panel. A diagnostic genetic variant was identified in 52 (8·5%; 95% CI 6·4–11·0) of 610 fetuses assessed and an additional 24 (3·9%) fetuses had a variant of uncertain significance that had potential clinical usefulness. Detection of diagnostic genetic variants enabled us to distinguish between syndromic and non-syndromic fetal anomalies (eg, congenital heart disease only vs a syndrome with congenital heart disease and learning disability). Diagnostic genetic variants were present in 22 (15·4%) of 143 fetuses with multisystem anomalies (ie, more than one fetal structural anomaly), nine (11·1%) of 81 fetuses with cardiac anomalies, and ten (15·4%) of 65 fetuses with skeletal anomalies; these phenotypes were most commonly associated with diagnostic variants. However, diagnostic genetic variants were least common in fetuses with isolated increased nuchal translucency (≥4·0 mm) in the first trimester (in three [3·2%] of 93 fetuses). Interpretation WES facilitates genetic diagnosis of fetal structural anomalies, which enables more accurate predictions of fetal prognosis and risk of recurrence in future pregnancies. However, the overall detection of diagnostic genetic variants in a prospectively ascertained cohort with a broad range of fetal structural anomalies is lower than that suggested by previous smaller-scale studies of fewer phenotypes. 
WES improved the identification of genetic disorders in fetuses with structural abnormalities; however, before clinical implementation, careful consideration should be given to case selection to maximise clinical usefulness. Funding UK Department of Health and Social Care and The Wellcome Trust.
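
The headline diagnostic yield in the findings above (52 of 610 fetuses) can be checked with a few lines of arithmetic. The sketch below recomputes the proportion and a 95% confidence interval. The Wilson score interval is an assumption made here for illustration: the abstract does not say which interval method was used, so the bounds may differ slightly from the published 6·4–11·0.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p_hat = successes / total
    denom = 1 + z**2 / total
    centre = (p_hat + z**2 / (2 * total)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)
    ) / denom
    return centre - half, centre + half


# Counts taken from the abstract: 52 diagnostic variants in 610 fetuses.
diagnosed, assessed = 52, 610
rate = diagnosed / assessed
low, high = wilson_interval(diagnosed, assessed)
print(f"yield {rate:.1%}, 95% CI {low:.1%} to {high:.1%}")
```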

275 citations


Journal ArticleDOI
06 Jun 2019-Nature
TL;DR: Exome-sequencing analyses of 20,791 individuals with type 2 diabetes and non-diabetic control participants from 5 ancestries identify gene-level associations of rare variants in 4 genes at exome-wide significance and propose a method to interpret these modest rare-variant associations and incorporate these associations into future target or gene prioritization efforts.
Abstract: Protein-coding genetic variants that strongly affect disease risk can yield relevant clues to disease pathogenesis. Here we report exome-sequencing analyses of 20,791 individuals with type 2 diabetes (T2D) and 24,440 non-diabetic control participants from 5 ancestries. We identify gene-level associations of rare variants (with minor allele frequencies of less than 0.5%) in 4 genes at exome-wide significance, including a series of more than 30 SLC30A8 alleles that conveys protection against T2D, and in 12 gene sets, including those corresponding to T2D drug targets (P = 6.1 × 10⁻³) and candidate genes from knockout mice (P = 5.2 × 10⁻³). Within our study, the strongest T2D gene-level signals for rare variants explain at most 25% of the heritability of the strongest common single-variant signals, and the gene-level effect sizes of the rare variants that we observed in established T2D drug targets will require 75,000–185,000 sequenced cases to achieve exome-wide significance. We propose a method to interpret these modest rare-variant associations and to incorporate these associations into future target or gene prioritization efforts.

228 citations


Journal ArticleDOI
TL;DR: In conclusion, rapid genomic sequencing can be performed as a first-tier diagnostic test in inpatient infants and urWGS had the shortest time to result, which was important in unstable infants, and those in whom a genetic diagnosis was likely to impact immediate management.
Abstract: The second Newborn Sequencing in Genomic Medicine and Public Health study was a randomized, controlled trial of the effectiveness of rapid whole-genome or -exome sequencing (rWGS or rWES, respectively) in seriously ill infants with diseases of unknown etiology. Here we report comparisons of analytic and diagnostic performance. Of 1,248 ill inpatient infants, 578 (46%) had diseases of unknown etiology. 213 infants (37% of those eligible) were enrolled within 96 h of admission. 24 infants (11%) were very ill and received ultra-rapid whole-genome sequencing (urWGS). The remaining infants were randomized, 95 to rWES and 94 to rWGS. The analytic performance of rWGS was superior to rWES, including variants likely to affect protein function, and ClinVar pathogenic/likely pathogenic variants (p

212 citations


Journal ArticleDOI
TL;DR: A diagnostic tool based on blood RNA-seq is shown to identify causal genes and variants linked to clinical phenotypes in individuals with rare diseases for which whole-exome genetic sequencing was uninformative.
Abstract: It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene [1]. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches [2–5]. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases [6–8]. This includes muscle biopsies from patients with undiagnosed rare muscle disorders [6,9], and cultured fibroblasts from patients with mitochondrial disorders [7]. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution. A diagnostic tool based on blood RNA-seq is shown to identify causal genes and variants linked to clinical phenotypes in individuals with rare diseases for which whole-exome genetic sequencing was uninformative.
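
The abstract above describes comparing each patient's blood RNA-seq data against a large control set to find candidate disease genes. One simple way to picture the expression part of such a comparison is a per-gene z-score against controls. The sketch below is illustrative only and is not the authors' pipeline: the gene names, expression values, and the cutoff of 3 standard deviations are hypothetical assumptions.

```python
import statistics


def expression_outliers(case_expr, control_expr, z_cutoff=3.0):
    """Return genes whose expression in one case deviates strongly from controls."""
    outliers = {}
    for gene, value in case_expr.items():
        controls = control_expr[gene]
        mean = statistics.fmean(controls)
        sd = statistics.stdev(controls)
        if sd == 0:
            # No variation among controls: a z-score is undefined, so skip.
            continue
        z = (value - mean) / sd
        if abs(z) >= z_cutoff:
            outliers[gene] = z
    return outliers


# Hypothetical normalised expression values (e.g. log-scale), for illustration.
controls = {
    "GENE_X": [10.1, 9.8, 10.3, 10.0, 9.9, 10.2],
    "GENE_Y": [5.0, 5.4, 4.7, 5.1, 5.2, 4.9],
}
case = {"GENE_X": 6.5, "GENE_Y": 5.3}
hits = expression_outliers(case, controls)
print(hits)
```

In practice the flagged genes would still need the gene-level and variant-level filtering the abstract mentions before any of them could be considered a candidate.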

206 citations


Journal ArticleDOI
TL;DR: A convergence in the genetics of severe and less-severe epilepsies associated with ultra-rare coding variation is confirmed, and it highlights a ubiquitous role for GABAergic inhibition in epilepsy etiology.
Abstract: Sequencing-based studies have identified novel risk genes associated with severe epilepsies and revealed an excess of rare deleterious variation in less-severe forms of epilepsy. To identify the shared and distinct ultra-rare genetic risk factors for different types of epilepsies, we performed a whole-exome sequencing (WES) analysis of 9,170 epilepsy-affected individuals and 8,436 controls of European ancestry. We focused on three phenotypic groups: severe developmental and epileptic encephalopathies (DEEs), genetic generalized epilepsy (GGE), and non-acquired focal epilepsy (NAFE). We observed that compared to controls, individuals with any type of epilepsy carried an excess of ultra-rare, deleterious variants in constrained genes and in genes previously associated with epilepsy; we saw the strongest enrichment in individuals with DEEs and the least strong in individuals with NAFE. Moreover, we found that inhibitory GABA-A receptor genes were enriched for missense variants across all three classes of epilepsy, whereas no enrichment was seen in excitatory receptor genes. The larger gene groups for the GABAergic pathway or cation channels also showed a significant mutational burden in DEEs and GGE. Although no single gene surpassed exome-wide significance among individuals with GGE or NAFE, highly constrained genes and genes encoding ion channels were among the lead associations; such genes included CACNA1G, EEF1A2, and GABRG2 for GGE and LGI1, TRIM3, and GABRG2 for NAFE. Our study, the largest epilepsy WES study to date, confirms a convergence in the genetics of severe and less-severe epilepsies associated with ultra-rare coding variation, and it highlights a ubiquitous role for GABAergic inhibition in epilepsy etiology.
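
The "excess of ultra-rare, deleterious variants" in cases versus controls described above is typically summarised as an enrichment statistic. The sketch below computes an odds ratio with a Wald-type 95% confidence interval from a 2×2 carrier table. The cohort sizes (9,170 cases and 8,436 controls) come from the abstract; the carrier counts are hypothetical, and the choice of statistic is an assumption made for illustration, not the study's actual burden test.

```python
import math


def odds_ratio_ci(case_carriers, n_cases, control_carriers, n_controls, z=1.96):
    """Odds ratio for carrier status with a Wald confidence interval."""
    a, b = case_carriers, n_cases - case_carriers
    c, d = control_carriers, n_controls - control_carriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high


# Cohort sizes from the abstract; carrier counts (460, 300) are hypothetical.
or_, low, high = odds_ratio_ci(460, 9170, 300, 8436)
print(f"OR {or_:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An interval that excludes 1 would indicate enrichment in cases under these made-up counts; a real analysis would also adjust for covariates such as ancestry and sequencing coverage.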

190 citations


Journal ArticleDOI
TL;DR: Reanalysis of Clinical Exome Data and Diagnostic Yield: as knowledge about genetic causes of disease improves, periodic reanalysis of clinical exome sequence could yield new genetic information.
Abstract: Reanalysis of Clinical Exome Data and Diagnostic Yield. As knowledge about genetic causes of disease improves, periodic reanalysis of clinical exome sequence could yield new genetic information. Thi...

172 citations


Journal ArticleDOI
Dorota Monies1, M. Abouelhoda1, Mirna Assoum, Nabil Moghrabi1, Rafiullah Rafiullah, Naif A.M. Almontashiri2, Mohammed Al-Owain, Hamad Al-Zaidan, Moeen Al-Sayed, Shazia Subhani1, Edward Cupler, Maha Faden, Amal Alhashem, Alya Qari, Aziza Chedrawi, Hisham Aldhalaan, Wesam Kurdi, Sameena Khan, Zuhair Rahbeeni, Maha Alotaibi, Ewa Goljan1, Hadeel Elbardisy, Mohamed El-Kalioby1, Zeeshan Shah1, Hibah Alruwaili1, Amal Jaafar1, Ranad Albar1, Asma Akilan, Hamsa T. Tayeb1, Asma I. Tahir1, Mohammed Fawzy1, Mohammed Nasr1, Shaza Makki, Abdullah Alfaifi, Hanna Akleh, Suad Al Yamani, Dalal K. Bubshait, Mohammed Mahnashi, Talal A. Basha, Afaf Alsagheir, Musad Abu Khaled, Khalid Alsaleem, Maisoon Almugbel, Manal Badawi, Fahad A. Bashiri3, Saeed Bohlega, Raashida Sulaiman, Ehab Tous, Syed Ahmed, Talal Algoufi, Hamoud Al-Mousa, Emadia Alaki, Susan Alhumaidi4, Hadeel Alghamdi, Malak Alghamdi4, Ahmed Sahly, Shapar Nahrir4, Ali Al-Ahmari5, Hisham Alkuraya, Ali Al-Mehaidib, Mohammed Abanemai, Fahad Alsohaibaini, Bandar Al-Saud, Rand Arnaout, Ghada M H Abdel-Salam, Hasan Al-Dhekri, Suzan A AlKhater6, Khalid S. Alqadi, Essam Al-Sabban, Turki Alshareef, Khalid Awartani, Hanaa Banjar, Nada Alsahan, Ibraheem F. Abosoudah, Abdullah Alashwal, Wajeeh Aldekhail, Sami Al-Hajjar, Sulaiman M. Al-Mayouf, Abdulaziz Alsemari, Walaa Alshuaibi3, Saeed Altala, Abdulhadi Altalhi4, Salah Baz, Muddathir H Hamad3, Tariq Abalkhail, Badi Alenazi, Alya Alkaff, Fahad Almohareb, Fuad Al Mutairi7,8, Mona Alsaleh, Abdullah Alsonbul, Somaya Alzelaye, Shakir Bahzad, Abdulaziz Bin Manee, Ola Jarrad, Neama Meriki3, Bassem Albeirouti, Amal Alqasmi4, Mohammed AlBalwi7, Nawal Makhseed, Saeed Hassan3, Isam Salih, Mustafa A. Salih3, Marwan Shaheen, Saadeh Sermin, Shamsad Shahrukh, Shahrukh K. Hashmi, Ayman Shawli8, Ameen Tajuddin, Abdullah Tamim, Ahmed Alnahari, Ibrahim Ghemlas, Maged H. Hussein, Sami Wali, Hatem Murad, Brian F. Meyer1, Fowzan S. Alkuraya1,5
TL;DR: The influence of a predominantly autosomal-recessive landscape on the clinical utility of rapid sequencing (Flash Exome) is described and the cohort's genotypic and phenotypic data represent a unique resource that can contribute to improved variant interpretation through data sharing.
Abstract: We report the results of clinical exome sequencing (CES) on >2,200 previously unpublished Saudi families as a first-tier test. The predominance of autosomal-recessive causes allowed us to make several key observations. We highlight 155 genes that we propose to be recessive, disease-related candidates. We report additional mutational events in 64 previously reported candidates (40 recessive), and these events support their candidacy. We report recessive forms of genes that were previously associated only with dominant disorders and that have phenotypes ranging from consistent with to conspicuously distinct from the known dominant phenotypes. We also report homozygous loss-of-function events that can inform the genetics of complex diseases. We were also able to deduce the likely causal variant in most couples who presented after the loss of one or more children, but we lack samples from those children. Although a similar pattern of mostly recessive causes was observed in the prenatal setting, the higher proportion of loss-of-function events in these cases was notable. The allelic series presented by the wealth of recessive variants greatly expanded the phenotypic expression of the respective genes. We also make important observations about dominant disorders; these observations include the pattern of de novo variants, the identification of 74 candidate dominant, disease-related genes, and the potential confirmation of 21 previously reported candidates. Finally, we describe the influence of a predominantly autosomal-recessive landscape on the clinical utility of rapid sequencing (Flash Exome). Our cohort's genotypic and phenotypic data represent a unique resource that can contribute to improved variant interpretation through data sharing.

Journal ArticleDOI
18 Apr 2019-Cell
TL;DR: Pre-malignant somatic alterations are often viewed through the lens of cancer, but it is shown that mutations can promote regeneration, likely independent of carcinogenesis.

Journal ArticleDOI
TL;DR: A genomic landscape analysis of samples collected from three continents reveals a potential role for CDK4/6 or MEK inhibition in the treatment of the disease.
Abstract: Knowledge of key drivers and therapeutic targets in mucosal melanoma is limited due to the paucity of comprehensive mutation data on this rare tumor type. To better understand the genomic landscape of mucosal melanoma, here we describe whole genome sequencing analysis of 67 tumors and validation of driver gene mutations by exome sequencing of 45 tumors. Tumors have a low point mutation burden and high numbers of structural variants, including recurrent structural rearrangements targeting TERT, CDK4 and MDM2. Significantly mutated genes are NRAS, BRAF, NF1, KIT, SF3B1, TP53, SPRED1, ATRX, HLA-A and CHD8. SF3B1 mutations occur more commonly in female genital and anorectal melanomas and CTNNB1 mutations implicate a role for WNT signaling defects in the genesis of some mucosal melanomas. TERT aberrations and ATRX mutations are associated with alterations in telomere length. Mutation profiles of the majority of mucosal melanomas suggest potential susceptibility to CDK4/6 and/or MEK inhibitors.

Journal ArticleDOI
TL;DR: A pilot study for SPARK identified variants in genes and loci that are clinically recognized causes or significant contributors to ASD in 10.4% of families without previous genetic findings, and BRSK2 has the strongest statistical support and reaches genome-wide significance as a risk gene for ASD.
Abstract: Autism spectrum disorder (ASD) is a genetically heterogeneous condition, caused by a combination of rare de novo and inherited variants as well as common variants in at least several hundred genes. However, significantly larger sample sizes are needed to identify the complete set of genetic risk factors. We conducted a pilot study for SPARK (SPARKForAutism.org) of 457 families with ASD, all consented online. Whole exome sequencing (WES) and genotyping data were generated for each family using DNA from saliva. We identified variants in genes and loci that are clinically recognized causes or significant contributors to ASD in 10.4% of families without previous genetic findings. In addition, we identified variants that are possibly associated with ASD in an additional 3.4% of families. A meta-analysis using the TADA framework at a false discovery rate (FDR) of 0.1 provides statistical support for 26 ASD risk genes. While most of these genes are already known ASD risk genes, BRSK2 has the strongest statistical support and reaches genome-wide significance as a risk gene for ASD (p-value = 2.3e−06). Future studies leveraging the thousands of individuals with ASD who have enrolled in SPARK are likely to further clarify the genetic risk factors associated with ASD as well as accelerate ASD research that incorporates genetic etiology.

Journal ArticleDOI
TL;DR: Phenotypic annotation of all human genes; development of bioinformatic tools and analytic methods; exploration of non-Mendelian modes of inheritance including reduced penetrance, multilocus variation, and oligogenic inheritance; construction of allelic series at a locus; enhanced data sharing worldwide; and integration with clinical genomics are explored.

Journal ArticleDOI
TL;DR: In this multi-centre study of adults with CKD, a molecular genetic diagnosis was established in over one-third of families and whole exome sequencing (WES) may be an important tool to identify the cause of CKD in adults.

Journal ArticleDOI
TL;DR: Remarkably, antisense oligonucleotides targeting the aberrant splice processes resulted in (partial) correction of all splicing defects, showing the great potential of splice modulation therapy for deep-intronic variants.

Journal ArticleDOI
31 Jul 2019-Nature
TL;DR: Exome-wide sequencing studies of populations in Finland identified 26 deleterious alleles associated with 64 quantitative traits that are clinically relevant to cardiovascular and metabolic diseases, including 19 unique to or more than 20 times more frequent in Finnish individuals than in other Europeans.
Abstract: Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power. Exome-wide sequencing studies of populations in Finland identified 26 deleterious alleles associated with 64 quantitative traits that are clinically relevant to cardiovascular and metabolic diseases.

Journal ArticleDOI
TL;DR: The application of existing biological databases and bioinformatics tools to address the bottleneck in data processing and analysis is presented, including the need for new generation big data analytics for the multi-omics challenges of personalized medicine.
Abstract: There is growing attention toward personalized medicine. This is led by a fundamental shift from the 'one size fits all' paradigm for treatment of patients with conditions or predisposition to diseases, to one that embraces novel approaches, such as tailored target therapies, to achieve the best possible outcomes. Driven by these, several national and international genome projects have been initiated to reap the benefits of personalized medicine. Exome and targeted sequencing provide a balance between cost and benefit, in contrast to whole genome sequencing (WGS). Whole exome sequencing (WES) targets approximately 3% of the whole genome, which is the basis for protein-coding genes. Nonetheless, it has the characteristics of big data in large deployment. Herein, the application of WES and its relevance in advancing personalized medicine is reviewed. WES is mapped to Big Data "10 Vs" and the resulting challenges discussed. The application of existing biological databases and bioinformatics tools to address the bottleneck in data processing and analysis is presented, including the need for new generation big data analytics for the multi-omics challenges of personalized medicine. This includes the incorporation of artificial intelligence (AI) in the clinical utility landscape of genomic information, and future consideration to create a new frontier toward advancing the field of personalized medicine.

Posted ContentDOI
09 Mar 2019-bioRxiv
TL;DR: The first tranche of large-scale exome sequence data for 49,960 study participants is described, revealing approximately 4 million coding variants and 231,631 predicted loss of function variants, a >10-fold increase compared to imputed sequence for the same participants.
Abstract: The UK Biobank is a prospective study of 502,543 individuals, combining extensive phenotypic and genotypic data with streamlined access for researchers around the world. Here we describe the first tranche of large-scale exome sequence data for 49,960 study participants, revealing approximately 4 million coding variants (of which ~98.4% have frequency < 1%) and 231,631 predicted loss of function variants, a >10-fold increase compared to imputed sequence for the same participants. Nearly all genes (>97%) had ≥1 predicted loss of function carrier, and most genes (>69%) had ≥10 loss of function carriers. We illustrate the power of characterizing loss of function variation in this large population through association analyses across 1,741 phenotypes. In addition to replicating a range of established associations, we discover novel loss of function variants with large effects on disease traits, including PIEZO1 on varicose veins, COL6A1 on corneal resistance, MEPE on bone density, and IQGAP2 and GMPR on blood cell traits. We further demonstrate the value of exome sequencing by surveying the prevalence of pathogenic variants of clinical significance in this population, finding that 2% of the population has a medically actionable variant. Additionally, we leverage the phenotypic data to characterize the relationship between rare BRCA1 and BRCA2 pathogenic variants and cancer risk. Exomes from the first 49,960 participants are now made accessible to the scientific community and highlight the promise offered by genomic sequencing in large-scale population-based studies.

Journal ArticleDOI
TL;DR: The number of rare likely deleterious variants in functionally intolerant genes (“other hits”) correlated with expression of neurodevelopmental phenotypes in probands with 16p12.1 deletion and in autism probands carrying gene-disruptive variants (n=184, p=0.03) compared with their carrier family members.

Journal ArticleDOI
TL;DR: The case for the early use of genomic testing in the diagnostic trajectory is strengthened, and laboratory policy on periodic WES data reanalysis can be guided, and the cost and benefits of cascade testing and reproductive service use are guided.

Journal ArticleDOI
TL;DR: Advances in genomic research and risk-directed therapy have led to improvements in the long-term survival and quality of life outcomes of patients with childhood acute lymphoblastic leukaemia, but a growing number of genetic conditions that predispose patients to develop ALL have been identified.
Abstract: Advances in genomic research and risk-directed therapy have led to improvements in the long-term survival and quality of life outcomes of patients with childhood acute lymphoblastic leukaemia (ALL). The application of next-generation sequencing technologies, especially transcriptome sequencing, has resulted in the identification of novel molecular subtypes of ALL with prognostic and therapeutic implications, as well as cooperative mutations that account for much of the heterogeneity in clinical responses observed among patients with specific ALL subtypes. In addition, germline genetic variants have been shown to influence the risk of developing ALL and/or the responses of non-malignant and leukaemia cells to therapy; shared pathways for drug activation and metabolism are implicated in treatment-related toxicity and drug sensitivity or resistance, depending on whether the genetic changes are germline, somatic or both. Indeed, although once considered a non-hereditary disease, genomic investigations of familial and sporadic ALL have revealed a growing number of genetic alterations or conditions that predispose individuals to the development of ALL and treatment-related second cancers. The identification of these genetic alterations holds the potential to direct genetic counselling, testing and possibly monitoring for the early detection of ALL and other cancers. Herein, we review these advances in our understanding of the genomic landscape of childhood ALL and their clinical implications. Herein, advances in our understanding of the genomic landscape of childhood acute lymphoblastic leukaemia (ALL), encompassing both somatic and germline alterations, are reviewed. The clinical implications of these alterations, particularly those in the germ line, are discussed with regard to susceptibility to ALL, treatment responses and therapy-related toxicities.

01 Jan 2019
TL;DR: The authors used exome-sequencing analyses of a large cohort of patients with type 2 diabetes and control individuals without diabetes from five ancestries to identify gene-level associations between rare variants and type 2 diabetes.
Abstract: Protein-coding genetic variants that strongly affect disease risk can yield relevant clues to disease pathogenesis. Here we report exome-sequencing analyses of 20,791 individuals with type 2 diabetes (T2D) and 24,440 non-diabetic control participants from 5 ancestries. We identify gene-level associations of rare variants (with minor allele frequencies of less than 0.5%) in 4 genes at exome-wide significance, including a series of more than 30 SLC30A8 alleles that conveys protection against T2D, and in 12 gene sets, including those corresponding to T2D drug targets (P = 6.1 × 10⁻³) and candidate genes from knockout mice (P = 5.2 × 10⁻³). Within our study, the strongest T2D gene-level signals for rare variants explain at most 25% of the heritability of the strongest common single-variant signals, and the gene-level effect sizes of the rare variants that we observed in established T2D drug targets will require 75,000–185,000 sequenced cases to achieve exome-wide significance. We propose a method to interpret these modest rare-variant associations and to incorporate these associations into future target or gene prioritization efforts. Exome-sequencing analyses of a large cohort of patients with type 2 diabetes and control individuals without diabetes from five ancestries are used to identify gene-level associations of rare variants that are associated with type 2 diabetes.

Journal ArticleDOI
TL;DR: It is shown that this unbiased method, which does not rely upon specific knowledge on individual genes, is effective in both identifying previously unknown disease gene associations, and flagging genes that have previously been incorrectly implicated in disease.
Abstract: The diagnostic yield of exome and genome sequencing remains low (8-70%), due to incomplete knowledge on the genes that cause disease. To improve this, we use RNA-seq data from 31,499 samples to predict which genes cause specific disease phenotypes, and develop GeneNetwork Assisted Diagnostic Optimization (GADO). We show that this unbiased method, which does not rely upon specific knowledge on individual genes, is effective in both identifying previously unknown disease gene associations, and flagging genes that have previously been incorrectly implicated in disease. GADO can be run on www.genenetwork.nl by supplying HPO-terms and a list of genes that contain candidate variants. Finally, applying GADO to a cohort of 61 patients for whom exome-sequencing analysis had not resulted in a genetic diagnosis, yields likely causative genes for ten cases.

Journal ArticleDOI
TL;DR: Analysis of whole-exome sequencing data from 2,343 individuals with autism spectrum disorder compared to 5,852 unaffected individuals demonstrates an excess of biallelic, autosomal mutations for both loss-of-function and damaging missense variants.
Abstract: Autism spectrum disorder (ASD) affects up to 1 in 59 individuals [1]. Genome-wide association and large-scale sequencing studies strongly implicate both common variants [2-4] and rare de novo variants [5-10] in ASD. Recessive mutations have also been implicated [11-14] but their contribution remains less well defined. Here we demonstrate an excess of biallelic loss-of-function and damaging missense mutations in a large ASD cohort, corresponding to approximately 5% of total cases, including 10% of females, consistent with a female protective effect. We document biallelic disruption of known or emerging recessive neurodevelopmental genes (CA2, DDHD1, NSUN2, PAH, RARB, ROGDI, SLC1A1, USH2A) as well as other genes not previously implicated in ASD including FEV (FEV transcription factor, ETS family member), which encodes a key regulator of the serotonergic circuitry. Our data refine estimates of the contribution of recessive mutation to ASD and suggest new paths for illuminating previously unknown biological pathways responsible for this condition.

Journal ArticleDOI
TL;DR: The NOTCH1 locus is the most frequent site of genetic variants predisposing to nonsyndromic TOF, followed by FLT4, and variants in these genes are found in almost 7% of TOF patients.
Abstract: Rationale: Familial recurrence studies provide strong evidence for a genetic component to the predisposition to sporadic, nonsyndromic Tetralogy of Fallot (TOF), the most common cyanotic congenital...

Journal ArticleDOI
TL;DR: Exome and transcriptome sequencing is performed on precursor lesions and invasive lung adenocarcinomas, identifying recurrently mutated genes in pre/minimally invasive cases and arm-level alteration events linked to immune infiltration.
Abstract: Adenocarcinoma in situ and minimally invasive adenocarcinoma are the pre-invasive forms of lung adenocarcinoma. The genomic and immune profiles of these lesions are poorly understood. Here we report exome and transcriptome sequencing of 98 lung adenocarcinoma precursor lesions and 99 invasive adenocarcinomas. We have identified EGFR, RBM10, BRAF, ERBB2, TP53, KRAS, MAP2K1 and MET as significantly mutated genes in the pre/minimally invasive group. Classes of genome alterations that increase in frequency during the progression to malignancy are revealed. These include mutations in TP53, arm-level copy number alterations, and HLA loss of heterozygosity. Immune infiltration is correlated with copy number alterations of chromosome arm 6p, suggesting a link between arm-level events and the tumor immune environment.

01 Jan 2019
TL;DR: In this paper, the role of rare coding variants in clinically relevant quantitative cardiometabolic traits was investigated using exome sequencing of nearly 20,000 individuals from northern and eastern Finland.
Abstract: Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power. Exome-wide sequencing studies of populations in Finland identified 26 deleterious alleles associated with 64 quantitative traits that are clinically relevant to cardiovascular and metabolic diseases.

Journal ArticleDOI
01 Jun 2019-Gut
TL;DR: This paper dissected the evolutionary history of colitis-associated colorectal cancer (CA-CRC) using multiregion sequencing and reconstructed the temporal sequence of events leading to CA-CRC using phylogenetic trees.
Abstract: Objective: IBD confers an increased lifetime risk of developing colorectal cancer (CRC), and colitis-associated CRC (CA-CRC) is molecularly distinct from sporadic CRC (S-CRC). Here we have dissected the evolutionary history of CA-CRC using multiregion sequencing. Design: Exome sequencing was performed on fresh-frozen multiple regions of carcinoma, adjacent non-cancerous mucosa and blood from 12 patients with CA-CRC (n=55 exomes), and key variants were validated with orthogonal methods. Genome-wide copy number profiling was performed using single nucleotide polymorphism arrays and low-pass whole genome sequencing on archival non-dysplastic mucosa (n=9), low-grade dysplasia (LGD; n=30), high-grade dysplasia (HGD; n=13), mixed LGD/HGD (n=7) and CA-CRC (n=19). Phylogenetic trees were reconstructed, and evolutionary analysis used to reveal the temporal sequence of events leading to CA-CRC. Results: 10/12 tumours were microsatellite stable with a median mutation burden of 3.0 single nucleotide alterations (SNA) per Mb, ~20% higher than S-CRC (2.5 SNAs/Mb), and consistent with elevated ageing-associated mutational processes. Non-dysplastic mucosa had considerable mutation burden (median 47 SNAs), including mutations shared with the neighbouring CA-CRC, indicating a precancer mutational field. CA-CRCs were often near triploid (40%) or near tetraploid (20%) and phylogenetic analysis revealed that copy number alterations (CNAs) began to accrue in non-dysplastic bowel, but the LGD/HGD transition often involved a punctuated ‘catastrophic’ CNA increase. Conclusions: Evolutionary genomic analysis revealed precancer clones bearing extensive SNAs and CNAs, with progression to cancer involving a dramatic accrual of CNAs at HGD. Detection of the cancerised field is an encouraging prospect for surveillance, but punctuated evolution may limit the window for early detection.