Showing papers in "Genome Medicine in 2021"


Journal ArticleDOI
TL;DR: The authors integrated two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu ), a widely used tool for genome-wide variant effect prediction that was previously developed to weight and integrate diverse collections of genomic annotations.
Abstract: Splicing of genomic exons into mRNAs is a critical prerequisite for the accurate synthesis of human proteins. Genetic variants impacting splicing underlie a substantial proportion of genetic disease, but are challenging to identify beyond those occurring at donor and acceptor dinucleotides. To address this, various methods aim to predict variant effects on splicing. Recently, deep neural networks (DNNs) have been shown to achieve better results in predicting splice variants than other strategies. It has been unclear how best to integrate such process-specific scores into genome-wide variant effect predictors. Here, we use a recently published experimental data set to compare several machine learning methods that score variant effects on splicing. We integrate the best of those approaches into general variant effect prediction models and observe the effect on classification of known pathogenic variants. We integrate two specialized splicing scores into CADD (Combined Annotation Dependent Depletion; cadd.gs.washington.edu ), a widely used tool for genome-wide variant effect prediction that we previously developed to weight and integrate diverse collections of genomic annotations. With this new model, CADD-Splice, we show that inclusion of splicing DNN effect scores substantially improves predictions across multiple variant categories, without compromising overall performance. While splice effect scores show superior performance on splice variants, specialized predictors cannot compete with other variant scores in general variant interpretation, as the latter account for nonsense and missense effects that do not alter splicing. Although only shown here for splice scores, we believe that the applied approach will generalize to other specific molecular processes, providing a path for the further improvement of genome-wide variant effect prediction.

252 citations


Journal ArticleDOI
TL;DR: This article performed RNA-seq of whole blood cell transcriptomes and granulocyte preparations from mild and severe COVID-19 patients and analyzed the data using a combination of conventional and data-driven co-expression analysis.
Abstract: The SARS-CoV-2 pandemic is currently leading to increasing numbers of COVID-19 patients all over the world. Clinical presentations range from asymptomatic, mild respiratory tract infection, to severe cases with acute respiratory distress syndrome, respiratory failure, and death. Reports on a dysregulated immune system in the severe cases call for a better characterization and understanding of the changes in the immune system. In order to dissect COVID-19-driven immune host responses, we performed RNA-seq of whole blood cell transcriptomes and granulocyte preparations from mild and severe COVID-19 patients and analyzed the data using a combination of conventional and data-driven co-expression analysis. Additionally, publicly available data were used to show the distinction between COVID-19 and other diseases. Reverse drug target prediction was used to identify known or novel drug candidates based on the data-driven findings. Here, we profiled whole blood transcriptomes of 39 COVID-19 patients and 10 control donors enabling a data-driven stratification based on molecular phenotype. Neutrophil activation-associated signatures were prominently enriched in severe patient groups, which was corroborated in whole blood transcriptomes from an independent second cohort of 30 as well as in granulocyte samples from a third cohort of 16 COVID-19 patients (44 samples). Comparison of COVID-19 blood transcriptomes with those of a collection of over 3100 samples derived from 12 different viral infections, inflammatory diseases, and independent control samples revealed highly specific transcriptome signatures for COVID-19. Further, stratified transcriptomes predicted patient subgroup-specific drug candidates targeting the dysregulated systemic immune response of the host. Our study provides novel insights into the distinct molecular subgroups or phenotypes that are not simply explained by clinical parameters. 
We show that whole blood transcriptomes are extremely informative for COVID-19 since they capture granulocytes which are major drivers of disease severity.

159 citations


Journal ArticleDOI
TL;DR: Deep learning is a subdiscipline of artificial intelligence that uses artificial neural networks to extract patterns and make predictions from large data sets; this review gives an overview of how it may be applied in cancer diagnosis, prognosis and treatment management.
Abstract: Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.

149 citations


Journal ArticleDOI
TL;DR: In this paper, the authors developed a meta-analysis and integration pipeline that models and removes the effects of technology, tissue of origin, and donor that confound cell-type identification.
Abstract: Immunosuppressive and anti-cytokine treatment may have a protective effect for patients with COVID-19. Understanding the immune cell states shared between COVID-19 and other inflammatory diseases with established therapies may help nominate immunomodulatory therapies. To identify cellular phenotypes that may be shared across tissues affected by disparate inflammatory diseases, we developed a meta-analysis and integration pipeline that models and removes the effects of technology, tissue of origin, and donor that confound cell-type identification. Using this approach, we integrated > 300,000 single-cell transcriptomic profiles from COVID-19-affected lungs and tissues from healthy subjects and patients with five inflammatory diseases: rheumatoid arthritis (RA), Crohn’s disease (CD), ulcerative colitis (UC), systemic lupus erythematosus (SLE), and interstitial lung disease. We tested the association of shared immune states with severe/inflamed status compared to healthy control using mixed-effects modeling. To define environmental factors within these tissues that shape shared macrophage phenotypes, we stimulated human blood-derived macrophages with defined combinations of inflammatory factors, emphasizing in particular antiviral interferons IFN-beta (IFN-β) and IFN-gamma (IFN-γ), and pro-inflammatory cytokines such as TNF. We built an immune cell reference consisting of > 300,000 single-cell profiles from 125 healthy or disease-affected donors from COVID-19 and five inflammatory diseases. We observed a CXCL10+ CCL2+ inflammatory macrophage state that is shared and strikingly abundant in severe COVID-19 bronchoalveolar lavage samples, inflamed RA synovium, inflamed CD ileum, and UC colon. These cells exhibited a distinct arrangement of pro-inflammatory and interferon response genes, including elevated levels of CXCL10, CXCL9, CCL2, CCL3, GBP1, STAT1, and IL1B. Further, we found this macrophage phenotype is induced upon co-stimulation by IFN-γ and TNF-α. 
Our integrative analysis identified immune cell states shared across inflamed tissues affected by inflammatory diseases and COVID-19. Our study supports a key role for IFN-γ together with TNF-α in driving an abundant inflammatory macrophage phenotype in severe COVID-19-affected lungs, as well as inflamed RA synovium, CD ileum, and UC colon, which may be targeted by existing immunomodulatory therapies.

97 citations


Journal ArticleDOI
TL;DR: CoronaHiT is a platform- and throughput-flexible method that uses transposase-based library preparation of ARTIC PCR products to sequence SARS-CoV-2 genomes on MinION or Illumina NextSeq.
Abstract: We present CoronaHiT, a platform and throughput flexible method for sequencing SARS-CoV-2 genomes (≤ 96 on MinION or > 96 on Illumina NextSeq) depending on changing requirements experienced during the pandemic. CoronaHiT uses transposase-based library preparation of ARTIC PCR products. Method performance was demonstrated by sequencing 2 plates containing 95 and 59 SARS-CoV-2 genomes on nanopore and Illumina platforms and comparing to the ARTIC LoCost nanopore method. Of the 154 samples sequenced using all 3 methods, ≥ 90% genome coverage was obtained for 64.3% using ARTIC LoCost, 71.4% using CoronaHiT-ONT and 76.6% using CoronaHiT-Illumina, with almost identical clustering on a maximum likelihood tree. This protocol will aid the rapid expansion of SARS-CoV-2 genome sequencing globally.

78 citations


Journal ArticleDOI
TL;DR: The findings here illustrate the intra-host bottlenecks and evolutionary dynamics of SARS-CoV-2 in different anatomic sites and may provide new insights to understand the virus-host interactions of coronaviruses and other RNA viruses.
Abstract: As of early February 2021, the causative agent of COVID-19, SARS-CoV-2, has infected over 104 million people with more than 2 million deaths according to official reports. Understanding the biology and virus-host interactions of SARS-CoV-2 requires knowledge of the mutation and evolution of this virus at both inter- and intra-host levels. However, despite quite a few polymorphic sites identified among SARS-CoV-2 populations, intra-host variant spectra and their evolutionary dynamics remain mostly unknown. Using high-throughput sequencing of metatranscriptomic and hybrid captured libraries, we characterized consensus genomes and intra-host single nucleotide variations (iSNVs) of serial samples collected from eight patients with COVID-19. The distribution of iSNVs along the SARS-CoV-2 genome was analyzed and co-occurring iSNVs among COVID-19 patients were identified. We also compared the evolutionary dynamics of SARS-CoV-2 population in the respiratory tract (RT) and gastrointestinal tract (GIT). The 32 consensus genomes revealed the co-existence of different genotypes within the same patient. We further identified 40 iSNVs. Most (30/40) iSNVs were present in a single patient, while ten iSNVs were found in at least two patients or were identical to consensus variants. Comparing allele frequencies of the iSNVs revealed a clear genetic differentiation between intra-host populations from the RT and GIT, mostly driven by bottleneck events during intra-host migrations. Compared to RT populations, the GIT populations showed a better maintenance and rapid development of viral genetic diversity following the suspected intra-host bottlenecks. Our findings here illustrate the intra-host bottlenecks and evolutionary dynamics of SARS-CoV-2 in different anatomic sites and may provide new insights to understand the virus-host interactions of coronaviruses and other RNA viruses.

75 citations


Journal ArticleDOI
Henrik Stranneheim 1,2,3; Kristina Lagerstedt-Robinson 1,3; Måns Magnusson 1,2; Malin Kvarnung 1,3; Daniel Nilsson 1,3; Nicole Lesko 1,3; Martin Engvall 1,3; Britt-Marie Anderlid 1,3; Henrik Arnell 1; Carolina Backman Johansson 3; Michela Barbaro 3; Erik Björck 1,3; Helene Bruhn 1,3; Jesper Eisfeldt 1,3; Christoph Freyer 1,3; Giedre Grigelioniene 1,3; Peter Gustavsson 1,3; Anna Hammarsjö 1,3; Maritta Hellström-Pigg 1,3; Erik Iwarsson 1,3; Anders Jemt 1; Mikael Laaksonen 2; Sara Lind Enoksson 3; Helena Malmgren 1,3; Karin Naess 3; Magnus Nordenskjöld 1,3; Mikael Oscarson 3; Maria Pettersson 1,3; Chiara Rasi 2; Adam Rosenbaum 2; Ellika Sahlin 1,3; Eliane Sardh 1,3; Tommy Stödberg 1,3; Bianca Tesi 1,3; Emma Tham 1,3; Håkan Thonberg 1,3; Virpi Töhönen 1; Ulrika von Döbeln 3; Daphne Vassiliou 1,3; Sofie Vonlanthen 3; Ann-Charlotte Wikström 3; Josephine Wincent 1,3; Ola Winqvist 3; Anna Wredenberg 1,3; Sofia Ygberg 1,3; Rolf Zetterström 1,3; Per Marits 3; Maria Johansson Soller 1,3; Ann Nordgren 1,3; Valtteri Wirta 2; Anna Lindstrand 1,3; Anna Wedell 1,2,3
TL;DR: In this article, the authors report the findings from 4437 individuals (3219 patients and 1218 relatives) who have been analyzed by whole genome sequencing (WGS) at the Genomic Medicine Center Karolinska-Rare Diseases (GMCK-RD) since mid-2015.
Abstract: We report the findings from 4437 individuals (3219 patients and 1218 relatives) who have been analyzed by whole genome sequencing (WGS) at the Genomic Medicine Center Karolinska-Rare Diseases (GMCK-RD) since mid-2015. GMCK-RD represents a long-term collaborative initiative between Karolinska University Hospital and Science for Life Laboratory to establish advanced, genomics-based diagnostics in the Stockholm healthcare setting. Our analysis covers detection and interpretation of SNVs, INDELs, uniparental disomy, CNVs, balanced structural variants, and short tandem repeat expansions. Visualization of results for clinical interpretation is carried out in Scout—a custom-developed decision support system. Results from both singleton (84%) and trio/family (16%) analyses are reported. Variant interpretation is done by 15 expert teams at the hospital involving staff from three clinics. For patients with complex phenotypes, data is shared between the teams. Overall, 40% of the patients received a molecular diagnosis ranging from 19 to 54% for specific disease groups. There was heterogeneity regarding causative genes (n = 754) with some of the most common ones being COL2A1 (n = 12; skeletal dysplasia), SCN1A (n = 8; epilepsy), and TNFRSF13B (n = 4; inborn errors of immunity). Some causative variants were recurrent, including previously known founder mutations, some novel mutations, and recurrent de novo mutations. Overall, GMCK-RD has resulted in a large number of patients receiving specific molecular diagnoses. Furthermore, negative cases have been included in research studies that have resulted in the discovery of 17 published, novel disease-causing genes. To facilitate the discovery of new disease genes, GMCK-RD has joined international data sharing initiatives, including ClinVar, UDNI, Beacon, and MatchMaker Exchange. Clinical WGS at GMCK-RD has provided molecular diagnoses to over 1200 individuals with a broad range of rare diseases. 
Consolidation and spread of this clinical-academic partnership will enable large-scale national collaboration.

72 citations


Journal ArticleDOI
TL;DR: In this article, the authors used 16S rRNA gene sequencing to identify taxonomic groups associated with prognostic clinicopathologic features in breast cancer, and then used network analysis based on Spearman coefficients to correlate specific bacterial taxa with immunological data from NanoString gene expression and 65-plex cytokine assays.
Abstract: Currently, over half of breast cancer cases are unrelated to known risk factors, highlighting the importance of discovering other cancer-promoting factors. Since crosstalk between gut microbes and host immunity contributes to many diseases, we hypothesized that similar interactions could occur between the recently described breast microbiome and local immune responses to influence breast cancer pathogenesis. Using 16S rRNA gene sequencing, we characterized the microbiome of human breast tissue in a total of 221 patients with breast cancer, 18 individuals predisposed to breast cancer, and 69 controls. We performed bioinformatic analyses using a DADA2-based pipeline and applied linear models with White’s t or Kruskal–Wallis H-tests with Benjamini–Hochberg multiple testing correction to identify taxonomic groups associated with prognostic clinicopathologic features. We then used network analysis based on Spearman coefficients to correlate specific bacterial taxa with immunological data from NanoString gene expression and 65-plex cytokine assays. Multiple bacterial genera exhibited significant differences in relative abundance when stratifying by breast tissue type (tumor, tumor adjacent normal, high-risk, healthy control), cancer stage, grade, histologic subtype, receptor status, lymphovascular invasion, or node-positive status, even after adjusting for confounding variables. Microbiome–immune networks within the breast tended to be bacteria-centric, with sparse structure in tumors and more interconnected structure in benign tissues. Notably, Anaerococcus, Caulobacter, and Streptococcus, which were major bacterial hubs in benign tissue networks, were absent from cancer-associated tissue networks. In addition, Propionibacterium and Staphylococcus, which were depleted in tumors, showed negative associations with oncogenic immune features; Streptococcus and Propionibacterium also correlated positively with T-cell activation-related genes. 
This study, the largest to date comparing healthy versus cancer-associated breast microbiomes using fresh-frozen surgical specimens and immune correlates, provides insight into microbial profiles that correspond with prognostic clinicopathologic features in breast cancer. It additionally presents evidence for local microbial–immune interplay in breast cancer that merits further investigation and has preventative, diagnostic, and therapeutic potential.
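
The Benjamini–Hochberg multiple testing correction named in this abstract can be illustrated with a minimal Python sketch. The function name and the example p-values are hypothetical and are not taken from the study; the sketch shows only the standard step-up adjustment, not the authors' pipeline.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four taxa (illustration only).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A taxon would then be called significant when its adjusted value falls below the chosen false discovery rate, for example 0.05.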

64 citations


Journal ArticleDOI
TL;DR: This opinion article redefines a "guild" as a group of gut bacteria that show consistent co-abundant behavior, and discusses how to use guilds as the aggregation unit to reduce dimensionality and sparsity in microbiome-wide association studies.
Abstract: To demonstrate the causative role of gut microbiome in human health and diseases, we first need to identify, via next-generation sequencing, potentially important functional members associated with specific health outcomes and disease phenotypes. However, due to the strain-level genetic complexity of the gut microbiota, microbiome datasets are highly dimensional and highly sparse in nature, making it challenging to identify putative causative agents of a particular disease phenotype. Members of an ecosystem seldom live independently from each other. Instead, they develop local interactions and form inter-member organizations to influence the ecosystem’s higher-level patterns and functions. In the ecological study of macro-organisms, members are defined as belonging to the same “guild” if they exploit the same class of resources in a similar way or work together as a coherent functional group. Translating the concept of “guild” to the study of gut microbiota, we redefine guild as a group of bacteria that show consistent co-abundant behavior and are likely to work together to contribute to the same ecological function. In this opinion article, we discuss how to use guilds as the aggregation unit to reduce dimensionality and sparsity in microbiome-wide association studies for identifying candidate gut bacteria that may causatively contribute to human health and diseases.

62 citations


Journal ArticleDOI
TL;DR: In this paper, the authors argue that despite the parallels to the monogenic setting, new work is urgently needed to gather data, consider normative implications, and develop best practices around this emerging branch of genomics.
Abstract: Clinical use of polygenic risk scores (PRS) will look very different to the more familiar monogenic testing. Here we argue that despite these differences, most of the ethical, legal, and social issues (ELSI) raised in the monogenic setting, such as the relevance of results to family members, the approach to secondary and incidental findings, and the role of expert mediators, continue to be relevant in the polygenic context, albeit in modified form. In addition, PRS will reanimate other old debates. Their use has been proposed both in the practice of clinical medicine and of public health, two contexts with differing norms. In each of these domains, it is unclear what endpoints clinical use of PRS should aim to maximize and under what constraints. Reducing health disparities is a key value for public health, but clinical use of PRS could exacerbate race-based health disparities owing to differences in predictive power across ancestry groups. Finally, PRS will force a reckoning with pre-existing questions concerning biomarkers, namely the relevance of self-reported race, ethnicity and ancestry, and the relationship of risk factors to disease diagnoses. In this Opinion, we argue that despite the parallels to the monogenic setting, new work is urgently needed to gather data, consider normative implications, and develop best practices around this emerging branch of genomics.

60 citations


Journal ArticleDOI
TL;DR: DeepProg is a novel ensemble framework of deep-learning and machine-learning approaches that robustly predicts patient survival subtypes using multi-omics data and yields significantly better risk-stratification than other multi-omics integration methods.
Abstract: Multi-omics data are good resources for prognosis and survival prediction; however, these are difficult to integrate computationally. We introduce DeepProg, a novel ensemble framework of deep-learning and machine-learning approaches that robustly predicts patient survival subtypes using multi-omics data. It identifies two optimal survival subtypes in most cancers and yields significantly better risk-stratification than other multi-omics integration methods. DeepProg is highly predictive, exemplified by two liver cancer (C-index 0.73–0.80) and five breast cancer datasets (C-index 0.68–0.73). Pan-cancer analysis associates common genomic signatures in poor survival subtypes with extracellular matrix modeling, immune deregulation, and mitosis processes. DeepProg is freely available at https://github.com/lanagarmire/DeepProg
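
The C-index values quoted in this abstract measure how well predicted risk ranks patients by survival. The Python sketch below assumes Harrell's concordance index, which the abstract does not specify; the function name and toy data are hypothetical and are not part of DeepProg.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk failed earlier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustration only): survival times, event flags
# (1 = event observed, 0 = censored), and predicted risk scores.
c_index = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the 0.68–0.80 range reported above.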

Journal ArticleDOI
TL;DR: In this article, the authors examined three brain regions (dorsolateral prefrontal cortex, medulla oblongata, and choroid plexus) from 5 patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and 4 controls.
Abstract: Background: Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, has been associated with neurological and neuropsychiatric illness in many individuals. We sought to further our understanding of the relationship between brain tropism, neuro-inflammation, and host immune response in acute COVID-19 cases. Methods: Three brain regions (dorsolateral prefrontal cortex, medulla oblongata, and choroid plexus) from 5 patients with severe COVID-19 and 4 controls were examined. The presence of the virus was assessed by western blot against viral spike protein, as well as viral transcriptome analysis covering > 99% of SARS-CoV-2 genome and all potential serotypes. Droplet-based single-nucleus RNA sequencing (snRNA-seq) was performed in the same samples to examine the impact of COVID-19 on transcription in individual cells of the brain. Results: Quantification of viral spike S1 protein and viral transcripts did not detect SARS-CoV-2 in the postmortem brain tissue. However, analysis of 68,557 single-nucleus transcriptomes from three distinct regions of the brain identified an increased proportion of stromal cells, monocytes, and macrophages in the choroid plexus of COVID-19 patients. Furthermore, differential gene expression, pseudo-temporal trajectory, and gene regulatory network analyses revealed transcriptional changes in the cortical microglia associated with a range of biological processes, including cellular activation, mobility, and phagocytosis. Conclusions: Despite the absence of detectable SARS-CoV-2 in the brain at the time of death, the findings suggest significant and persistent neuroinflammation in patients with acute COVID-19.

Journal ArticleDOI
TL;DR: This study addresses the previously unknown role of microglial PGC-1α in post-stroke immune modulation and shows that mice with microglia-specific PGC-1α overexpression exhibited significantly decreased neurologic deficits after ischemic injury, with reduced NLRP3 activation and proinflammatory cytokine production.
Abstract: Neuroinflammation and immune responses occurring minutes to hours after stroke are associated with brain injury after acute ischemic stroke (AIS). PPARγ coactivator-1α (PGC-1α), as a master coregulator of gene expression in mitochondrial biogenesis, was found to be transiently upregulated in microglia after AIS. However, the role of microglial PGC-1α in poststroke immune modulation remains unknown. PGC-1α expression in microglia from human and mouse brain samples following ischemic stroke was first determined. Subsequently, we employed transgenic mice with microglia-specific overexpression of PGC-1α for middle cerebral artery occlusion (MCAO). The morphology and gene expression profile of microglia with PGC-1α overexpression were evaluated. Downstream inflammatory cytokine production and NLRP3 activation were also determined. ChIP-Seq analysis was performed to detect PGC-1α-binding sites in microglia. Autophagic and mitophagic activity was further monitored by immunofluorescence staining. Unc-51-like autophagy activating kinase 1 (ULK1) expression was evaluated under the PGC-1α interaction with ERRα. Finally, pharmacological inhibition and genomic knockdown of ULK1 were performed to estimate the role of ULK1 in mediating mitophagic activity after ischemic stroke. PGC-1α expression was transiently increased after ischemic stroke, not only in human brain samples but also in mouse brain samples. Microglia-specific PGC-1α overexpressing mice exhibited significantly decreased neurologic deficits after ischemic injury, with reduced NLRP3 activation and proinflammatory cytokine production. ChIP-Seq analysis and KEGG pathway analysis revealed that mitophagy was significantly enhanced. PGC-1α significantly promoted autophagic flux and induced autolysosome formation. More specifically, the autophagic clearance of mitochondria was enhanced by PGC-1α regulation, indicating the important role of mitophagy. 
Pharmacological inhibition or knockdown of ULK1 expression impaired autophagic/mitophagic activity, thus abolishing the neuroprotective effects of PGC-1α. Mechanistically, in AIS, PGC-1α promotes autophagy and mitophagy through ULK1 and reduces NLRP3 activation. Our findings indicate that microglial PGC-1α may be a promising therapeutic target for AIS.

Journal ArticleDOI
TL;DR: This article discusses evidence that the intestinal microbiome can modulate responses to immune checkpoint inhibitors (ICIs) via the host immune system and that the use of antibiotics can lead to reduced efficacy of ICIs.
Abstract: Immune checkpoint inhibitors (ICIs) are monoclonal antibodies that block immune inhibitory pathways. Administration of ICIs augments T cell-mediated immune responses against tumor, resulting in improved overall survival in cancer patients. It has emerged that the intestinal microbiome can modulate responses to ICIs via the host immune system and that the use of antibiotics can lead to reduced efficacy of ICIs. Recently, reports that fecal microbiota transplantation can lead to ICI therapy responses in patients previously refractory to therapy suggest that targeting the microbiome may be a viable strategy to reprogram the tumor microenvironment and augment ICI therapy. Intestinal microbial metabolites may also be linked to response rates to ICIs. In addition to response rates, certain toxicities that can arise during ICI therapy have also been found to be associated with the intestinal microbiome, including in particular colitis. A key mechanistic question is how certain microbes can enhance anti-tumor responses or, alternatively, predispose to ICI-associated colitis. Evidence has emerged that the intestinal microbiome can modulate outcomes to ICI therapies via two major mechanisms, including those that are antigen-specific and those that are antigen-independent. Antigen-specific mechanisms occur when epitopes are shared between microbial and tumor antigens that could enhance, or, alternatively, reduce anti-tumor immune responses via cross-reactive adaptive immune cells. Antigen-independent mechanisms include modulation of responses to ICIs by engaging innate and/or adaptive immune cells. To establish microbiome-based biomarkers of outcomes and specifically modulate the intestinal microbiome to enhance efficacy of ICIs in cancer immunotherapy, further prospective interventional studies will be required.

Journal ArticleDOI
TL;DR: In this paper, the authors present a public health-focussed scheme for genomic epidemiology of N. gonorrhoeae at Pathogenwatch ( https://pathogen.watch/ngonorrhoeae ), which provides customised bioinformatic pipelines guided by expert opinion.
Abstract: Antimicrobial-resistant (AMR) Neisseria gonorrhoeae is an urgent threat to public health, as strains resistant to at least one of the two last-line antibiotics used in empiric therapy of gonorrhoea, ceftriaxone and azithromycin, have spread internationally. Whole genome sequencing (WGS) data can be used to identify new AMR clones and transmission networks and inform the development of point-of-care tests for antimicrobial susceptibility, novel antimicrobials and vaccines. Community-driven tools that provide easy access to, and analysis of, genomic and epidemiological data are the way forward for public health surveillance. Here we present a public health-focussed scheme for genomic epidemiology of N. gonorrhoeae at Pathogenwatch ( https://pathogen.watch/ngonorrhoeae ). An international advisory group of experts in epidemiology, public health, genetics and genomics of N. gonorrhoeae was convened to inform on the utility of current and future analytics in the platform. We implement backwards compatibility with MLST, NG-MAST and NG-STAR typing schemes as well as an exhaustive library of genetic AMR determinants linked to a genotypic prediction of resistance to eight antibiotics. A collection of over 12,000 N. gonorrhoeae genome sequences from public archives has been quality-checked, assembled and made public together with available metadata for contextualization. AMR prediction from genome data revealed specificity values over 99% for azithromycin, ciprofloxacin and ceftriaxone and sensitivity values around 99% for benzylpenicillin and tetracycline. A case study using the Pathogenwatch collection of N. gonorrhoeae public genomes showed the global expansion of an azithromycin-resistant lineage carrying a mosaic mtr over at least the last 10 years, emphasising the power of Pathogenwatch to explore and evaluate genomic epidemiology questions of public health concern.
The N. gonorrhoeae scheme in Pathogenwatch provides customised bioinformatic pipelines guided by expert opinion that can be adapted to public health agencies and departments with little expertise in bioinformatics and lower-resourced settings with internet connection but limited computational infrastructure. The advisory group will assess and identify ongoing public health needs in the field of gonorrhoea, particularly regarding gonococcal AMR, in order to further enhance utility with modified or new analytic methods.
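The sensitivity and specificity figures quoted for genotypic AMR prediction can be read as simple ratios over a comparison of predicted and laboratory-observed resistance. The sketch below shows that arithmetic only. The function name and the toy isolates are assumptions for illustration and are not taken from Pathogenwatch or its data.

```python
def sensitivity_specificity(predicted, observed):
    """Compare genotypic predictions with phenotypic results.

    Both arguments are equal-length lists of booleans (True = resistant).
    Returns (sensitivity, specificity); a value is None when undefined.
    """
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # resistant, predicted resistant
    fn = sum(1 for p, o in pairs if not p and o)      # resistant, predicted susceptible
    tn = sum(1 for p, o in pairs if not p and not o)  # susceptible, predicted susceptible
    fp = sum(1 for p, o in pairs if p and not o)      # susceptible, predicted resistant
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical toy data: six isolates, one antibiotic.
predicted = [True, True, False, False, False, True]
observed = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(predicted, observed)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity is the fraction of truly resistant isolates that the genotype flags, and specificity is the fraction of truly susceptible isolates that it clears.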

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the intricacies and current methodologies of diet-microbial relations, the implications and limitations of these investigative approaches, and future considerations that may assist in accelerating applications.
Abstract: Personalised dietary modulation of the gut microbiota may be key to disease management. Current investigations provide a broad understanding of the impact of diet on the composition and activity of the gut microbiota, yet detailed knowledge in applying diet as an actionable tool remains limited. Further to the relative novelty of the field, approaches are yet to be standardised and extremely heterogeneous research outcomes have ensued. This may be related to confounders associated with complexities in capturing an accurate representation of both diet and the gut microbiota. This review discusses the intricacies and current methodologies of diet-microbial relations, the implications and limitations of these investigative approaches, and future considerations that may assist in accelerating applications. New investigations should consider improved collection of dietary data, further characterisation of mechanistic interactions, and an increased focus on -omic technologies such as metabolomics to describe the bacterial and metabolic activity of food degradation, together with its crosstalk with the host. Furthermore, clinical evidence with health outcomes is required before therapeutic dietary strategies for microbial amelioration can be made. The potential to reach detailed understanding of diet-microbiota relations may depend on re-evaluation, progression, and unification of research methodologies, which consider the complexities of these interactions.

Journal ArticleDOI
TL;DR: BAGEL2 is an updated version of the Bayesian Analysis of Gene Essentiality (BAGEL) algorithm, which classifies gene essentiality from short hairpin RNA and CRISPR/Cas9 genome-wide genetic screens.
Abstract: Identifying essential genes in genome-wide loss-of-function screens is a critical step in functional genomics and cancer target finding. We previously described the Bayesian Analysis of Gene Essentiality (BAGEL) algorithm for accurate classification of gene essentiality from short hairpin RNA and CRISPR/Cas9 genome-wide genetic screens. We introduce an updated version, BAGEL2, which employs an improved model that offers a greater dynamic range of Bayes Factors, enabling detection of tumor suppressor genes; a multi-target correction that reduces false positives from off-target CRISPR guide RNA; and the implementation of a cross-validation strategy that improves performance ~10× over the prior bootstrap resampling approach. We also describe a metric for screen quality at the replicate level and demonstrate how different algorithms handle lower quality data in substantially different ways. BAGEL2 substantially improves the sensitivity, specificity, and performance over BAGEL and establishes the new state of the art in the analysis of CRISPR knockout fitness screens. BAGEL2 is written in Python 3, and the source code, along with all supporting files, is available on GitHub ( https://github.com/hart-lab/bagel ).
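As a rough illustration of what a Bayes Factor for gene essentiality expresses, the sketch below sums, over a gene's guide RNAs, the log2 ratio of how likely each observed fold change is under an "essential" versus a "non-essential" reference distribution. It is a simplified sketch and not the BAGEL2 implementation: the Gaussian reference distributions, their parameters, and the function names are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log2_bayes_factor(fold_changes, essential=(-2.0, 1.0), nonessential=(0.0, 1.0)):
    """Sum of per-guide log2 likelihood ratios (essential vs non-essential).

    fold_changes: log fold changes of the guides targeting one gene.
    essential / nonessential: assumed (mean, sd) of the reference
    fold-change distributions; hypothetical values, not fitted to data.
    """
    total = 0.0
    for fc in fold_changes:
        p_ess = gaussian_pdf(fc, *essential)
        p_non = gaussian_pdf(fc, *nonessential)
        total += math.log2(p_ess / p_non)
    return total


# Hypothetical genes: strongly depleted guides vs guides with no dropout.
print(log2_bayes_factor([-2.1, -1.8, -2.4]))  # positive: looks essential
print(log2_bayes_factor([0.1, -0.2, 0.3]))    # negative: looks non-essential
```

In this toy model a positive score favours essentiality and a negative score favours non-essentiality.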

Journal ArticleDOI
TL;DR: In this article, the authors performed scRNA-seq of 18,403 cells collected without selection bias from 7 treatment-naive high-grade serous tubo-ovarian cancer (HGSTOC) tumours.
Abstract: High-grade serous tubo-ovarian cancer (HGSTOC) is characterised by extensive inter- and intratumour heterogeneity, resulting in persistent therapeutic resistance and poor disease outcome. Molecular subtype classification based on bulk RNA sequencing facilitates a more accurate characterisation of this heterogeneity, but the lack of strong prognostic or predictive correlations with these subtypes currently hinders their clinical implementation. Stromal admixture profoundly affects the prognostic impact of the molecular subtypes, but the contribution of stromal cells to each subtype has poorly been characterised. Increasing the transcriptomic resolution of the molecular subtypes based on single-cell RNA sequencing (scRNA-seq) may provide insights in the prognostic and predictive relevance of these subtypes. We performed scRNA-seq of 18,403 cells unbiasedly collected from 7 treatment-naive HGSTOC tumours. For each phenotypic cluster of tumour or stromal cells, we identified specific transcriptomic markers. We explored which phenotypic clusters correlated with overall survival based on expression of these transcriptomic markers in microarray data of 1467 tumours. By evaluating molecular subtype signatures in single cells, we assessed to what extent a phenotypic cluster of tumour or stromal cells contributes to each molecular subtype. We identified 11 cancer and 32 stromal cell phenotypes in HGSTOC tumours. Of these, the relative frequency of myofibroblasts, TGF-β-driven cancer-associated fibroblasts, mesothelial cells and lymphatic endothelial cells predicted poor outcome, while plasma cells correlated with more favourable outcome. Moreover, we identified a clear cell-like transcriptomic signature in cancer cells, which correlated with worse overall survival in HGSTOC patients. Stromal cell phenotypes differed substantially between molecular subtypes. 
For instance, the mesenchymal, immunoreactive and differentiated signatures were characterised by specific fibroblast, immune cell and myofibroblast/mesothelial cell phenotypes, respectively. Cell phenotypes correlating with poor outcome were enriched in molecular subtypes associated with poor outcome. We used scRNA-seq to identify stromal cell phenotypes predicting overall survival in HGSTOC patients. These stromal features explain the association of the molecular subtypes with outcome but also the latter’s weakness of clinical implementation. Stratifying patients based on marker genes specific for these phenotypes represents a promising approach to predict prognosis or response to therapy.

Journal ArticleDOI
TL;DR: In this article, the expression profiles of tRNA-derived small RNAs (tDRs) in colorectal cancer (CRC) plasma and their potential diagnostic values have been systematically explored by small RNA sequencing.
Abstract: tRNA-derived small RNAs (tDRs), which are widely distributed in human tissues including blood and urine, play an important role in the progression of cancer. However, the expression of tDRs in colorectal cancer (CRC) plasma and their potential diagnostic values have not been systematically explored. The expression profiles of tDRs in plasma of CRC patients and healthy controls (HCs) are investigated by small RNA sequencing. The level and diagnostic value of 5′-tRF-GlyGCC are evaluated by quantitative PCR in plasma samples from 105 CRC patients and 90 HCs. The mechanisms responsible for biogenesis of 5′-tRF-GlyGCC are checked by in vitro and in vivo models. 5′-tRF-GlyGCC is dramatically increased in plasma of CRC patients compared to that of HCs. The area under curve (AUC) for 5′-tRF-GlyGCC in CRC group is 0.882. The combination of carcinoembryonic antigen (CEA) and carbohydrate antigen 199 (CA199) with 5′-tRF-GlyGCC improves the AUC to 0.926. Consistently, the expression levels of 5′-tRF-GlyGCC in CRC cells and xenograft tissues are significantly greater than those in their corresponding controls. Blood cells co-cultured with CRC cells or mice xenografted with CRC tumors show increased levels of 5′-tRF-GlyGCC. In addition, we find that the increased expression of 5′-tRF-GlyGCC is dependent on the upregulation of AlkB homolog 3 (ALKBH3), a tRNA demethylase which can promote tRNA cleaving to generate tDRs. The level of 5′-tRF-GlyGCC in plasma is a promising biomarker for CRC diagnosis.

Journal ArticleDOI
TL;DR: This study characterizes the effects of RBX2660, a microbiota-based investigational therapeutic, on the composition and abundance of the gut microbiota and resistome, as well as multidrug-resistant organism carriage, after delivery to patients suffering from recurrent Clostridioides difficile infection (CDI).
Abstract: Once antibiotic-resistant bacteria become established within the gut microbiota, they can cause infections in the host and be transmitted to other people and the environment. Currently, there are no effective modalities for decreasing or preventing colonization by antibiotic-resistant bacteria. Intestinal microbiota restoration can prevent Clostridioides difficile infection (CDI) recurrences. Another potential application of microbiota restoration is suppression of non-C. difficile multidrug-resistant bacteria and overall decrease in the abundance of antibiotic resistance genes (the resistome) within the gut microbiota. This study characterizes the effects of RBX2660, a microbiota-based investigational therapeutic, on the composition and abundance of the gut microbiota and resistome, as well as multidrug-resistant organism carriage, after delivery to patients suffering from recurrent CDI. An open-label, multi-center clinical trial in 11 centers in the USA for the safety and efficacy of RBX2660 on recurrent CDI was conducted. Fecal specimens from 29 of these subjects with recurrent CDI who received either one (N = 16) or two doses of RBX2660 (N = 13) were analyzed secondarily. Stool samples were collected prior to and at intervals up to 6 months post-therapy and analyzed in three ways: (1) 16S rRNA gene sequencing for microbiota taxonomic composition, (2) whole metagenome shotgun sequencing for functional pathways and antibiotic resistome content, and (3) selective and differential bacterial culturing followed by isolate genome sequencing to longitudinally track multidrug-resistant organisms. Successful prevention of CDI recurrence with RBX2660 correlated with taxonomic convergence of patient microbiota to the donor microbiota as measured by weighted UniFrac distance. RBX2660 dramatically reduced the abundance of antibiotic-resistant Enterobacteriaceae in the 2 months after administration. 
Fecal antibiotic resistance gene carriage decreased in direct relationship to the degree to which donor microbiota engrafted. Microbiota-based therapeutics reduce resistance gene abundance and resistant organisms in the recipient gut microbiome. This approach could potentially reduce the risk of infections caused by resistant organisms within the patient and the transfer of resistance genes or pathogens to others. ClinicalTrials.gov, NCT01925417 ; registered on August 19, 2013.

Journal ArticleDOI
TL;DR: In a cohort of 307 generally healthy men, the authors found that greater dietary fiber intake was associated with shifts in gut microbiome composition, and with significantly greater C-reactive protein (CRP) reduction in individuals without substantial Prevotella copri carriage in the gut.
Abstract: A higher intake of dietary fiber is associated with a decreased risk of chronic inflammatory diseases such as cardiovascular disease and inflammatory bowel disease. This may function in part due to abrogation of chronic systemic inflammation induced by factors such as dysbiotic gut communities. Data regarding the detailed influences of long-term and recent intake of differing dietary fiber sources on the human gut microbiome are lacking. In a cohort of 307 generally healthy men, we examined gut microbiomes, profiled by shotgun metagenomic and metatranscriptomic sequencing, and long-term and recent dietary fiber intake in relation to plasma levels of C-reactive protein (CRP), an established biomarker for chronic inflammation. Data were analyzed using multivariate linear mixed models. We found that inflammation-associated gut microbial configurations corresponded with higher CRP levels. A greater intake of dietary fiber was associated with shifts in gut microbiome composition, particularly Clostridiales, and their potential for carbohydrate utilization via polysaccharide degradation. This was particularly true for fruit fiber sources (i.e., pectin). Most striking, fiber intake was associated with significantly greater CRP reduction in individuals without substantial Prevotella copri carriage in the gut, whereas those with P. copri carriage maintained stable CRP levels regardless of fiber intake. Our findings offer human evidence supporting a fiber-gut microbiota interaction, as well as a potential specific mechanism by which gut-mediated systemic inflammation may be mitigated.

Journal ArticleDOI
TL;DR: Using the Your DNA, Your Say survey, the authors examined how participants perceived the relative value of measures to demonstrate the trustworthiness of those using donated DNA and/or medical information.
Abstract: Public trust is central to the collection of genomic and health data and the sustainability of genomic research. To merit trust, those involved in collecting and sharing data need to demonstrate they are trustworthy. However, it is unclear what measures are most likely to demonstrate this. We analyse the ‘Your DNA, Your Say’ online survey of public perspectives on genomic data sharing including responses from 36,268 individuals across 22 low-, middle- and high-income countries, gathered in 15 languages. We examine how participants perceived the relative value of measures to demonstrate the trustworthiness of those using donated DNA and/or medical information. We examine between-country variation and present a consolidated ranking of measures. Providing transparent information about who will benefit from data access was the most important measure to increase trust, endorsed by more than 50% of participants across 20 of 22 countries. It was followed by the option to withdraw data and transparency about who is using data and why. Variation was found for the importance of measures, notably information about sanctions for misuse of data—endorsed by 5% in India but almost 60% in Japan. A clustering analysis suggests alignment between some countries in the assessment of specific measures, such as the UK and Canada, Spain and Mexico and Portugal and Brazil. China and Russia are less closely aligned with other countries in terms of the value of the measures presented. Our findings highlight the importance of transparency about data use and about the goals and potential benefits associated with data sharing, including to whom such benefits accrue. They show that members of the public value knowing what benefits accrue from the use of data. The study highlights the importance of locally sensitive measures to increase trust as genomic data sharing continues globally.

Journal ArticleDOI
TL;DR: In this paper, the authors performed an in silico analysis of 2608 plasmids derived from 814 completely sequenced K. pneumoniae strains to investigate the distribution of putative helper plasmids and mobilizable virulence plasmids.
Abstract: Klebsiella pneumoniae, as a global priority pathogen, is well known for its capability of acquiring mobile genetic elements that carry resistance and/or virulence genes. Its virulence plasmid, previously deemed nonconjugative and restricted within hypervirulent K. pneumoniae (hvKP), has disseminated into classic K. pneumoniae (cKP), particularly carbapenem-resistant K. pneumoniae (CRKP), which poses alarming challenges to public health. However, the mechanism underlying its transfer from hvKP to CRKP is unclear. A total of 28 sequence type (ST) 11 bloodstream infection-causing CRKP strains were collected from Ruijin Hospital in Shanghai, China, and used as recipients in conjugation assays. Transconjugants obtained from conjugation assays were confirmed by XbaI and S1 nuclease pulsed-field gel electrophoresis, PCR detection and/or whole-genome sequencing. The plasmid stability of the transconjugants was evaluated by serial culture. Genetically modified strains and constructed mimic virulence plasmids were employed to investigate the mechanisms underlying mobilization. The level of extracellular polysaccharides was measured by mucoviscosity assays and uronic acid quantification. An in silico analysis of 2608 plasmids derived from 814 completely sequenced K. pneumoniae strains available in GenBank was performed to investigate the distribution of putative helper plasmids and mobilizable virulence plasmids. A nonconjugative virulence plasmid was mobilized by the conjugative plasmid belonging to incompatibility group F (IncF) from the hvKP strain into ST11 CRKP strains under low extracellular polysaccharide-producing conditions or by employing intermediate E. coli strains. The virulence plasmid was mobilized via four modes: transfer alone, cotransfer with the conjugative IncF plasmid, hybrid plasmid formation due to two rounds of single-strand exchanges at specific 28-bp fusion sites or homologous recombination. 
According to the in silico analysis, 31.8% (242) of the putative helper plasmids and 98.8% (84/85) of the virulence plasmids carry the 28-bp fusion site. All virulence plasmids carry the origin of the transfer site. The nonconjugative virulence plasmid in ST11 CRKP strains is putatively mobilized from hvKP or E. coli intermediates with the help of conjugative IncF plasmids. Our findings emphasize the importance of raising public awareness of the rapid dissemination of virulence plasmids and the consistent emergence of hypervirulent carbapenem-resistant K. pneumoniae (hv-CRKP) strains.

Journal ArticleDOI
TL;DR: In this article, the authors highlight the importance and define challenges of proper control tissues and cell populations to exploit cancer epigenomes and summarize recent advances describing mechanisms leading to epigenetic changes in tumorigenesis and briefly discuss advances in investigating their translational potential.
Abstract: Epigenetic alterations are associated with normal biological processes such as aging or differentiation. Changes in global epigenetic signatures, together with genetic alterations, are driving events in several diseases including cancer. Comparative studies of cancer and healthy tissues found alterations in patterns of DNA methylation, histone posttranslational modifications, and changes in chromatin accessibility. Driven by sophisticated, next-generation sequencing-based technologies, recent studies discovered cancer epigenomes to be dominated by epigenetic patterns already present in the cell-of-origin, which transformed into a neoplastic cell. Tumor-specific epigenetic changes therefore need to be redefined and factors influencing epigenetic patterns need to be studied to unmask truly disease-specific alterations. The underlying mechanisms inducing cancer-associated epigenetic alterations are poorly understood. Studies of mutated epigenetic modifiers, enzymes that write, read, or edit epigenetic patterns, or mutated chromatin components, for example oncohistones, help to provide functional insights on how cancer epigenomes arise. In this review, we highlight the importance and define challenges of proper control tissues and cell populations to exploit cancer epigenomes. We summarize recent advances describing mechanisms leading to epigenetic changes in tumorigenesis and briefly discuss advances in investigating their translational potential.

Journal ArticleDOI
TL;DR: In this paper, the authors identify a pro-tumor subpopulation of bone marrow-derived macrophages in gliomas characterized by the scavenger receptor MARCO, and find that bulk MARCO expression is associated with worse prognosis and mesenchymal subtype.
Abstract: Macrophages are the most common infiltrating immune cells in gliomas and play a wide variety of pro-tumor and anti-tumor roles. However, the different subpopulations of macrophages and their effects on the tumor microenvironment remain poorly understood. We combined new and previously published single-cell RNA-seq data from 98,015 single cells from a total of 66 gliomas to profile 19,331 individual macrophages. Unsupervised clustering revealed a pro-tumor subpopulation of bone marrow-derived macrophages characterized by the scavenger receptor MARCO, which is almost exclusively found in IDH1-wild-type glioblastomas. Previous studies have implicated MARCO as an unfavorable marker in melanoma and non-small cell lung cancer; here, we find that bulk MARCO expression is associated with worse prognosis and mesenchymal subtype. Furthermore, MARCO expression is significantly altered over the course of treatment with anti-PD1 checkpoint inhibitors in a response-dependent manner, which we validate with immunofluorescence imaging. These findings illustrate a novel macrophage subpopulation that drives tumor progression in glioblastomas and suggest potential therapeutic targets to prevent their recruitment.

Journal ArticleDOI
TL;DR: In this paper, the authors used direct RNA sequencing to analyse transcript expression from the ChAdOx1 nCoV-19 genome in human MRC-5 and A549 cell lines that are non-permissive for vector replication alongside the replication permissive cell line, HEK293.
Abstract: ChAdOx1 nCoV-19 is a recombinant adenovirus vaccine against SARS-CoV-2 that has passed phase III clinical trials and is now in use across the globe. Although replication-defective in normal cells, 28 kbp of adenovirus genes is delivered to the cell nucleus alongside the SARS-CoV-2 S glycoprotein gene. We used direct RNA sequencing to analyse transcript expression from the ChAdOx1 nCoV-19 genome in human MRC-5 and A549 cell lines that are non-permissive for vector replication alongside the replication permissive cell line, HEK293. In addition, we used quantitative proteomics to study over time the proteome and phosphoproteome of A549 and MRC5 cells infected with the ChAdOx1 nCoV-19 vaccine. The expected SARS-CoV-2 S coding transcript dominated in all cell lines. We also detected rare S transcripts with aberrant splice patterns or polyadenylation site usage. Adenovirus vector transcripts were almost absent in MRC-5 cells, but in A549 cells, there was a broader repertoire of adenoviral gene expression at very low levels. Proteomically, in addition to S glycoprotein, we detected multiple adenovirus proteins in A549 cells compared to just one in MRC5 cells. Overall, the ChAdOx1 nCoV-19 vaccine’s transcriptomic and proteomic repertoire in cell culture is as expected. The combined transcriptomic and proteomics approaches provide a detailed insight into the behaviour of this important class of vaccine using state-of-the-art techniques and illustrate the potential of this technique to inform future viral vaccine vector design.

Journal ArticleDOI
TL;DR: In this paper, a cross-phenotype analysis of GWAS database (iCPAGdb) was created to identify traits with shared genetic architecture, using linkage disequilibrium (LD) information to accurately capture shared SNPs by proxy, and calculate significance of enrichment.
Abstract: While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods are needed to systematically bridge this crucial gap to facilitate experimental testing of hypotheses and translation to clinical utility. Here, we leveraged cross-phenotype associations to identify traits with shared genetic architecture, using linkage disequilibrium (LD) information to accurately capture shared SNPs by proxy, and calculate significance of enrichment. This shared genetic architecture was examined across differing biological scales through incorporating data from catalogs of clinical, cellular, and molecular GWAS. We have created an interactive web database (interactive Cross-Phenotype Analysis of GWAS database (iCPAGdb)) to facilitate exploration and allow rapid analysis of user-uploaded GWAS summary statistics. This database revealed well-known relationships among phenotypes and generated novel hypotheses to explain the pathophysiology of common diseases. Application of iCPAGdb to a recent GWAS of severe COVID-19 demonstrated unexpected overlap of GWAS signals between COVID-19 and human diseases, including with idiopathic pulmonary fibrosis driven by the DPP9 locus. Transcriptomics from peripheral blood of COVID-19 patients demonstrated that DPP9 was induced in patients with SARS-CoV-2 compared to healthy controls or those with bacterial infection. Further investigation of cross-phenotype SNPs associated with both severe COVID-19 and other human traits demonstrated colocalization of the GWAS signal at the ABO locus with plasma protein levels of a reported receptor of SARS-CoV-2, CD209 (DC-SIGN). This finding points to a possible mechanism whereby glycosylation of CD209 by ABO may regulate COVID-19 disease severity. 
Thus, connecting genetically related traits across phenotypic scales links human diseases to molecular and cellular measurements that can reveal mechanisms and lead to novel biomarkers and therapeutic approaches. The iCPAGdb web portal is accessible at http://cpag.oit.duke.edu and the software code at https://github.com/tbalmat/iCPAGdb .

Journal ArticleDOI
TL;DR: In this article, the authors report 119 new NDD cases (64 de novo variants) identified through sequencing and international collaborations, combine them with published clinical case reports, and provide evidence for variation in 12 HNRNP genes as candidates for NDDs.
Abstract: With the increasing number of genomic sequencing studies, hundreds of genes have been implicated in neurodevelopmental disorders (NDDs). The rate of gene discovery far outpaces our understanding of genotype–phenotype correlations, with clinical characterization remaining a bottleneck for understanding NDDs. Most disease-associated Mendelian genes are members of gene families, and we hypothesize that those with related molecular function share clinical presentations. We tested our hypothesis by considering gene families that have multiple members with an enrichment of de novo variants among NDDs, as determined by previous meta-analyses. One of these gene families is the heterogeneous nuclear ribonucleoproteins (hnRNPs), which has 33 members, five of which have been recently identified as NDD genes (HNRNPK, HNRNPU, HNRNPH1, HNRNPH2, and HNRNPR) and two of which have significant enrichment in our previous meta-analysis of probands with NDDs (HNRNPU and SYNCRIP). Utilizing protein homology, mutation analyses, gene expression analyses, and phenotypic characterization, we provide evidence for variation in 12 HNRNP genes as candidates for NDDs. Seven are potentially novel while the remaining genes in the family likely do not significantly contribute to NDD risk. We report 119 new NDD cases (64 de novo variants) through sequencing and international collaborations and combined with published clinical case reports. We consider 235 cases with gene-disruptive single-nucleotide variants or indels and 15 cases with small copy number variants. Three hnRNP-encoding genes reach nominal or exome-wide significance for de novo variant enrichment, while nine are candidates for pathogenic mutations. Comparison of HNRNP gene expression shows a pattern consistent with a role in cerebral cortical development with enriched expression among radial glial progenitors. 
Clinical assessment of probands (n = 188–221) expands the phenotypes associated with HNRNP rare variants, and phenotypes associated with variation in the HNRNP genes distinguish them as a subgroup of NDDs. Overall, our novel approach of exploiting gene families in NDDs identifies new HNRNP-related disorders, expands the phenotypes of known HNRNP-related disorders, strongly implicates disruption of the hnRNPs as a whole in NDDs, and supports the view that NDD subtypes likely have shared molecular pathogenesis. To date, this is the first study to identify novel genetic disorders based on the presence of disorders in related genes. We also perform the first phenotypic analyses focusing on related genes. Finally, we show that radial glial expression of these genes is likely critical during neurodevelopment. This is important for diagnostics, as well as developing strategies to best study these genes for the development of therapeutics.

Journal ArticleDOI
TL;DR: PRS improve cardiovascular risk stratification early in life when knowledge of later-life risk factors is unavailable, however, by middle age, when many risk factors are known, the improvement attributed to PRS is marginal for the general population.
Abstract: Several polygenic risk scores (PRS) have been developed for cardiovascular risk prediction, but the additive value of including PRS together with conventional risk factors for risk prediction is questionable. This study assesses the clinical utility of including four PRS generated from 194, 46K, 1.5M, and 6M SNPs, along with conventional risk factors, to predict risk of ischemic heart disease (IHD), myocardial infarction (MI), and first MI event on or before age 50 (early MI). A cross-validated logistic regression (LR) algorithm was trained either on ~ 440K European ancestry individuals from the UK Biobank (UKB), or the full UKB population, including as features different combinations of conventional established-at-birth risk factors (ancestry, sex) and risk factors that are non-fixed over an individual’s lifespan (age, BMI, hypertension, hyperlipidemia, diabetes, smoking, family history), with and without also including PRS. The algorithm was trained separately with IHD, MI, and early MI as prediction labels. When LR was trained using risk factors established-at-birth, adding the four PRS significantly improved the area under the curve (AUC) for IHD (0.62 to 0.67) and MI (0.67 to 0.73), as well as for early MI (0.70 to 0.79). When LR was trained using all risk factors, adding the four PRS only resulted in a significantly higher disease prevalence in the 98th and 99th percentiles of both the IHD and MI scores. PRS improve cardiovascular risk stratification early in life when knowledge of later-life risk factors is unavailable. However, by middle age, when many risk factors are known, the improvement attributed to PRS is marginal for the general population.
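The AUC comparisons reported above (for example 0.62 to 0.67 for IHD) measure how often a model ranks a case above a non-case. The sketch below computes AUC from that pairwise definition and compares two sets of risk scores. The scores and labels are invented toy values, and the two "models" are hypothetical stand-ins rather than the study's logistic regression.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) receives
    a higher score than a randomly chosen non-case (label 0); ties count 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks for six individuals (first three are cases).
labels = [1, 1, 1, 0, 0, 0]
baseline = [0.6, 0.4, 0.5, 0.5, 0.3, 0.45]   # e.g. fixed-at-birth factors only
with_prs = [0.7, 0.55, 0.6, 0.5, 0.3, 0.45]  # same model plus a PRS feature
print(round(auc(baseline, labels), 3), round(auc(with_prs, labels), 3))
```

The pairwise form is quadratic in sample size, so it suits small illustrations; a rank-based computation would be preferable for biobank-scale data.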

Journal ArticleDOI
TL;DR: In this article, the authors show that PRMT1-mediated H4R3me2a recruits SMARCA4, an ATPase subunit of the SWI/SNF chromatin remodeling complex, which promotes colorectal cancer (CRC) progression by enhancing EGFR signaling.
Abstract: Aberrant changes in epigenetic mechanisms such as histone modifications play an important role in cancer progression. PRMT1, which triggers asymmetric dimethylation of histone H4 on arginine 3 (H4R3me2a), is upregulated in human colorectal cancer (CRC) and is essential for cell proliferation. However, how this dysregulated modification might contribute to malignant transitions of CRC remains poorly understood. In this study, we integrated biochemical assays including protein interaction studies and chromatin immunoprecipitation (ChIP), cellular analysis including cell viability, proliferation, colony formation, and migration assays, clinical sample analysis, microarray experiments, and ChIP-Seq data to investigate the potential genomic recognition pattern of H4R3me2a in CRC cells and its effect on CRC progression. We show that PRMT1 and SMARCA4, an ATPase subunit of the SWI/SNF chromatin remodeling complex, act cooperatively to promote CRC progression. We find that SMARCA4 is a novel effector molecule of PRMT1-mediated H4R3me2a. Mechanistically, we show that H4R3me2a directly recruited SMARCA4 to promote the proliferative, colony-formative, and migratory abilities of CRC cells by enhancing EGFR signaling. We found that EGFR and TNS4 were major direct downstream transcriptional targets of PRMT1 and SMARCA4 in colon cells, and acted in a PRMT1 methyltransferase activity-dependent manner to promote CRC cell proliferation. In vivo, knockdown or inhibition of PRMT1 profoundly attenuated the growth of CRC cells in the C57BL/6J-ApcMin/+ CRC mouse model. Importantly, elevated expression of PRMT1 or SMARCA4 in CRC patients was positively correlated with expression of EGFR and TNS4, and these patients had shorter overall survival. These findings reveal a critical interplay between epigenetic and transcriptional control during CRC progression, suggesting that SMARCA4 is a novel key epigenetic modulator of CRC. 
Our findings thus highlight PRMT1/SMARCA4 inhibition as a potential therapeutic intervention strategy for CRC. PRMT1-mediated H4R3me2a recruits SMARCA4, which promotes colorectal cancer progression by enhancing EGFR signaling.