Showing papers by "Michael Snyder" published in 2018


Journal ArticleDOI
TL;DR: A mathematical expression is derived to compute PrediXcan results using summary data, and the effects of gene expression variation on human phenotypes in 44 GTEx tissues and >100 phenotypes are investigated.
Abstract: Scalable, integrative methods to understand mechanisms that link genetic variants with phenotypes are needed. Here we derive a mathematical expression to compute PrediXcan (a gene mapping approach) results using summary data (S-PrediXcan) and show its accuracy and general robustness to misspecified reference sets. We apply this framework to 44 GTEx tissues and 100+ phenotypes from GWAS and meta-analysis studies, creating a growing public catalog of associations that seeks to capture the effects of gene expression variation on human phenotypes. Replication in an independent cohort is shown. Most of the associations are tissue specific, suggesting context specificity of the trait etiology. Colocalized significant associations in unexpected tissues underscore the need for an agnostic scanning of multiple contexts to improve our ability to detect causal regulatory mechanisms. Monogenic disease genes are enriched among significant associations for related traits, suggesting that smaller alterations of these genes may cause a spectrum of milder phenotypes.

657 citations
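
The summary-data expression mentioned in the abstract above can be illustrated with a short sketch. It assumes the commonly cited S-PrediXcan form, Z_g ≈ Σ_l w_lg (σ_l / σ_g) (β_l / se(β_l)), where σ_g² = w' Γ w and Γ is the SNP covariance estimated from a reference panel. The function name, argument names, and toy numbers are hypothetical and are not taken from the authors' software.

```python
import math


def s_predixcan_z(weights, snp_cov, gwas_beta, gwas_se):
    """Approximate gene-level Z-score from GWAS summary statistics.

    weights   : prediction-model weights w_l for each SNP in the gene model
    snp_cov   : SNP covariance matrix (list of lists) from a reference panel
    gwas_beta : GWAS effect-size estimates for the same SNPs
    gwas_se   : standard errors of those estimates
    All names are illustrative assumptions, not the published API.
    """
    n = len(weights)
    # Variance of predicted expression: w' * Gamma * w
    var_g = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    if var_g <= 0:
        raise ValueError("predicted expression variance must be positive")
    sigma_g = math.sqrt(var_g)

    z_gene = 0.0
    for l in range(n):
        sigma_l = math.sqrt(snp_cov[l][l])   # SNP standard deviation
        z_snp = gwas_beta[l] / gwas_se[l]    # per-SNP GWAS Z-score
        z_gene += weights[l] * (sigma_l / sigma_g) * z_snp
    return z_gene


# Toy example: two uncorrelated SNPs with unit variance and equal weights.
z = s_predixcan_z([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.1, 0.1], [0.1, 0.1])
print(round(z, 4))
```

With a single SNP the gene-level score reduces to that SNP's GWAS Z-score, which is a quick sanity check on the weighting.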


Journal ArticleDOI
TL;DR: The potential for combining diverse types of data and the utility of this approach in human health and disease are discussed, and examples of data integration to understand, diagnose and inform treatment of diseases, including rare and common diseases as well as cancer and transplant biology, are provided.
Abstract: Advances in omics technologies - such as genomics, transcriptomics, proteomics and metabolomics - have begun to enable personalized medicine at an extraordinarily detailed molecular level. Individually, these technologies have contributed medical advances that have begun to enter clinical practice. However, each technology individually cannot capture the entire biological complexity of most human diseases. Integration of multiple technologies has emerged as an approach to provide a more comprehensive view of biology and disease. In this Review, we discuss the potential for combining diverse types of data and the utility of this approach in human health and disease. We provide examples of data integration to understand, diagnose and inform treatment of diseases, including rare and common diseases as well as cancer and transplant biology. Finally, we discuss technical and other challenges to clinical implementation of integrative omics.

589 citations


Journal ArticleDOI
TL;DR: This work frames central issues regarding determination of protein-level variation and PTMs, including some paradoxes present in the field today, and uses this framework to assess existing data and ask the question, "How many distinct primary structures of proteins (proteoforms) are created from the 20,300 human genes?"
Abstract: Despite decades of accumulated knowledge about proteins and their post-translational modifications (PTMs), numerous questions remain regarding their molecular composition and biological function.

516 citations


Journal ArticleDOI
TL;DR: Current and prospective wearable technologies and their progress toward clinical application are reviewed and technologies underlying common, commercially available wearable sensors and early-stage devices and research to support the use of these devices in healthcare are described.
Abstract: Wearable sensors are already impacting healthcare and medicine by enabling health monitoring outside of the clinic and prediction of health events. This paper reviews current and prospective wearable technologies and their progress toward clinical application. We describe technologies underlying common, commercially available wearable sensors and early-stage devices and outline research, when available, to support the use of these devices in healthcare. We cover applications in the following health areas: metabolic, cardiovascular and gastrointestinal monitoring; sleep, neurology, movement disorders and mental health; maternal, pre- and neo-natal care; and pulmonary health and environmental exposures. Finally, we discuss challenges associated with the adoption of wearable sensors in the current healthcare ecosystem and discuss areas for future research and development.

313 citations


Journal ArticleDOI
TL;DR: A short-term intervention with an isocaloric low-carbohydrate diet with increased protein content in obese subjects with NAFLD and the resulting alterations in metabolism and the gut microbiota are characterized using a multi-omics approach to highlight the potential of exploring diet-microbiota interactions for treating NAFLD.

305 citations


Journal ArticleDOI
TL;DR: It is shown that normal human cells generate large extrachromosomal circular DNAs (eccDNAs), most likely the products of excised DNA, that can be transcriptionally active and, thus, may have phenotypic consequences.
Abstract: The human genome is generally organized into stable chromosomes, and only tumor cells are known to accumulate kilobase (kb)-sized extrachromosomal circular DNA elements (eccDNAs). However, it must be expected that kb eccDNAs exist in normal cells as a result of mutations. Here, we purify and sequence eccDNAs from muscle and blood samples from 16 healthy men, detecting ~100,000 unique eccDNA types from 16 million nuclei. Half of these structures carry genes or gene fragments and the majority are smaller than 25 kb. Transcription from eccDNAs suggests that eccDNAs reside in nuclei and recurrence of certain eccDNAs in several individuals implies DNA circularization hotspots. Gene-rich chromosomes contribute to more eccDNAs per megabase and the most transcribed protein-coding gene in muscle, TTN (titin), provides the most eccDNAs per gene. Thus, somatic genomes are rich in chromosome-derived eccDNAs that may influence phenotypes through altered gene copy numbers and transcription of full-length or truncated genes. Somatic cells can accumulate structural variations such as deletions. Here, Moller et al. show that normal human cells generate large extrachromosomal circular DNAs (eccDNAs), most likely the products of excised DNA, that can be transcriptionally active and, thus, may have phenotypic consequences.

193 citations


Journal ArticleDOI
TL;DR: A controlled longitudinal weight perturbation study combining multiple omics strategies during periods of weight gain and loss in humans demonstrated that weight gain is associated with the activation of strong inflammatory and hypertrophic cardiomyopathy signatures in blood.
Abstract: Advances in omics technologies now allow an unprecedented level of phenotyping for human diseases, including obesity, in which individual responses to excess weight are heterogeneous and unpredictable. To aid the development of better understanding of these phenotypes, we performed a controlled longitudinal weight perturbation study combining multiple omics strategies (genomics, transcriptomics, multiple proteomics assays, metabolomics, and microbiomics) during periods of weight gain and loss in humans. Results demonstrated that: (1) weight gain is associated with the activation of strong inflammatory and hypertrophic cardiomyopathy signatures in blood; (2) although weight loss reverses some changes, a number of signatures persist, indicative of long-term physiologic changes; (3) we observed omics signatures associated with insulin resistance that may serve as novel diagnostics; (4) specific biomolecules were highly individualized and stable in response to perturbations, potentially representing stable personalized markers. Most data are available open access and serve as a valuable resource for the community.

159 citations


Journal ArticleDOI
TL;DR: It is shown that glucose dysregulation, as characterized by CGM, is more prevalent and heterogeneous than previously thought and can affect individuals considered normoglycemic by standard measures, and specific patterns of glycemic responses reflect variable underlying physiology.
Abstract: Diabetes is an increasing problem worldwide; almost 30 million people, nearly 10% of the population, in the United States are diagnosed with diabetes. Another 84 million are prediabetic, and without intervention, up to 70% of these individuals may progress to type 2 diabetes. Current methods for quantifying blood glucose dysregulation in diabetes and prediabetes are limited by reliance on single-time-point measurements or on average measures of overall glycemia and neglect glucose dynamics. We have used continuous glucose monitoring (CGM) to evaluate the frequency with which individuals demonstrate elevations in postprandial glucose, the types of patterns, and how patterns vary between individuals given an identical nutrient challenge. Measurement of insulin resistance and secretion highlights the fact that the physiology underlying dysglycemia is highly variable between individuals. We developed an analytical framework that can group individuals according to specific patterns of glycemic responses called “glucotypes” that reveal heterogeneity, or subphenotypes, within traditional diagnostic categories of glucose regulation. Importantly, we found that even individuals considered normoglycemic by standard measures exhibit high glucose variability using CGM, with glucose levels reaching prediabetic and diabetic ranges 15% and 2% of the time, respectively. We thus show that glucose dysregulation, as characterized by CGM, is more prevalent and heterogeneous than previously thought and can affect individuals considered normoglycemic by standard measures, and specific patterns of glycemic responses reflect variable underlying physiology. The interindividual variability in glycemic responses to standardized meals also highlights the personal nature of glucose regulation. 
Through extensive phenotyping, we developed a model for identifying potential mechanisms of personal glucose dysregulation and built a webtool for visualizing a user-uploaded CGM profile and classifying individualized glucose patterns into glucotypes.

147 citations
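
The "15% and 2% of the time" figures in the abstract above are fractions of CGM readings falling in a given range. A minimal sketch of that arithmetic follows. It assumes evenly spaced readings in mg/dL and uses 140 and 200 mg/dL as illustrative cut-offs; the thresholds, function name, and toy data are assumptions, not the study's definitions.

```python
def fraction_of_time_in_ranges(readings, prediabetic_min=140.0, diabetic_min=200.0):
    """Return (fraction in prediabetic range, fraction in diabetic range).

    readings        : evenly spaced CGM glucose values in mg/dL
    prediabetic_min : assumed lower bound of the prediabetic range
    diabetic_min    : assumed lower bound of the diabetic range
    With evenly spaced samples, the share of readings equals the share of time.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    total = len(readings)
    # Prediabetic range: at or above the lower cut-off, below the diabetic one.
    n_pre = sum(1 for g in readings if prediabetic_min <= g < diabetic_min)
    # Diabetic range: at or above the diabetic cut-off.
    n_dia = sum(1 for g in readings if g >= diabetic_min)
    return n_pre / total, n_dia / total


# Toy profile: 83 normal readings, 15 in the prediabetic range, 2 in the diabetic range.
toy = [100.0] * 83 + [150.0] * 15 + [210.0] * 2
pre, dia = fraction_of_time_in_ranges(toy)
print(pre, dia)
```

The toy profile is constructed to mirror the proportions quoted in the abstract; real CGM traces would also need handling for gaps and uneven sampling.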


Journal ArticleDOI
TL;DR: A rapid magnet-based phenotypic screening strategy is developed, and eight genome-wide CRISPR screens in human cells are performed to identify genes regulating phagocytosis of distinct substrates, highlighting roles for NHLRC2 in filopodia formation, very-long-chain fatty acids in substrate-specific phagocytosis and TM2D3 in uptake of amyloid-β aggregates.
Abstract: Phagocytosis is required for a broad range of physiological functions, from pathogen defense to tissue homeostasis, but the mechanisms required for phagocytosis of diverse substrates remain incompletely understood. Here, we developed a rapid magnet-based phenotypic screening strategy, and performed eight genome-wide CRISPR screens in human cells to identify genes regulating phagocytosis of distinct substrates. After validating select hits in focused miniscreens, orthogonal assays and primary human macrophages, we show that (1) the previously uncharacterized gene NHLRC2 is a central player in phagocytosis, regulating RhoA-Rac1 signaling cascades that control actin polymerization and filopodia formation, (2) very-long-chain fatty acids are essential for efficient phagocytosis of certain substrates and (3) the previously uncharacterized Alzheimer's disease-associated gene TM2D3 can preferentially influence uptake of amyloid-β aggregates. These findings illuminate new regulators and core principles of phagocytosis, and more generally establish an efficient method for unbiased identification of cellular uptake mechanisms across diverse physiological and pathological contexts.

126 citations


Journal ArticleDOI
Chao Jiang1, Wang Xin1, Xiyan Li1, Jingga Inlora1, Ting Wang1, Qing Liu1, Michael Snyder1 
20 Sep 2018-Cell
TL;DR: It is demonstrated that human exposomes are diverse, dynamic, spatiotemporally-driven interaction networks with the potential to impact human health.

116 citations


Journal ArticleDOI
TL;DR: This model not only significantly increased predictive power by combining all datasets, but also revealed novel interactions between different biological modalities, which provides the frameworks for future studies examining deviations implicated in pregnancy‐related pathologies including preterm birth and preeclampsia.
Abstract: Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides the frameworks for future studies examining deviations implicated in pregnancy-related pathologies including preterm birth and preeclampsia.

Journal ArticleDOI
TL;DR: An integrative analysis of tumor whole genomes and matched transcriptomes finds that the effects of noncoding mutations on DAAM1, MTG2 and HYI transcription are recapitulated in multiple cancer cell lines and that increasing DAAM1 expression leads to invasive cell migration.
Abstract: Although cancer genomes are replete with noncoding mutations, the effects of these mutations remain poorly characterized. Here we perform an integrative analysis of 930 tumor whole genomes and matched transcriptomes, identifying a network of 193 noncoding loci in which mutations disrupt target gene expression. These 'somatic eQTLs' (expression quantitative trait loci) are frequently mutated in specific cancer tissues, and the majority can be validated in an independent cohort of 3,382 tumors. Among these, we find that the effects of noncoding mutations on DAAM1, MTG2 and HYI transcription are recapitulated in multiple cancer cell lines and that increasing DAAM1 expression leads to invasive cell migration. Collectively, the noncoding loci converge on a set of core pathways, permitting a classification of tumors into pathway-based subtypes. The somatic eQTL network is disrupted in 88% of tumors, suggesting widespread impact of noncoding mutations in cancer.

Journal ArticleDOI
TL;DR: While the untargeted and targeted platforms detect similar numbers of lipids, the former identifies a broader range of lipid classes and can unambiguously identify all three fatty acids in triacylglycerols (TAG), suggesting that TAG metabolism is particularly sensitive to the aging process in mice.
Abstract: Lipidomics - the global assessment of lipids - can be performed using a variety of mass spectrometry (MS)-based approaches. However, choosing the optimal approach in terms of lipid coverage, robustness and throughput can be a challenging task. Here, we compare a novel targeted quantitative lipidomics platform known as the Lipidyzer to a conventional untargeted liquid chromatography (LC)-MS approach. We find that both platforms are efficient in profiling more than 300 lipids across 11 lipid classes in mouse plasma with precision and accuracy below 20% for most lipids. While the untargeted and targeted platforms detect similar numbers of lipids, the former identifies a broader range of lipid classes and can unambiguously identify all three fatty acids in triacylglycerols (TAG). Quantitative measurements from both approaches exhibit a median correlation coefficient (r) of 0.99 using a dilution series of deuterated internal standards and 0.71 using endogenous plasma lipids in the context of aging. Application of both platforms to plasma from aging mice reveals similar changes in total lipid levels across all major lipid classes and in specific lipid species. Interestingly, TAG is the lipid class that exhibits the most changes with age, suggesting that TAG metabolism is particularly sensitive to the aging process in mice. Collectively, our data show that the Lipidyzer platform provides comprehensive profiling of the most prevalent lipids in plasma in a simple and automated manner.

Journal ArticleDOI
TL;DR: The Human Proteome Project annually reports on progress throughout the field in credibly identifying and characterizing the human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences.
Abstract: The Human Proteome Project (HPP) annually reports on progress throughout the field in credibly identifying and characterizing the human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2018-01-17, the baseline for this sixth annual HPP special issue of the Journal of Proteome Research, contains 17 470 PE1 proteins, 89% of all neXtProt predicted PE1-4 proteins, up from 17 008 in release 2017-01-23 and 13 975 in release 2012-02-24. Conversely, the number of neXtProt PE2,3,4 missing proteins has been reduced from 2949 to 2579 to 2186 over the past two years. Of the PE1 proteins, 16 092 are based on mass spectrometry results, and 1378 on other kinds of protein studies, notably protein-protein interaction findings. PeptideAtlas has 15 798 canonical proteins, up 625 over the past year, including 269 from SUMOylation studies. The largest reason for missing proteins is low abundance. Meanwhile, the Human Protein Atlas has released its Cell Atlas, Pathology Atlas, and updated Tissue Atlas, and is applying recommendations from the International Working Group on Antibody Validation. Finally, there is progress using the quantitative multiplex organ-specific popular proteins targeted proteomics approach in various disease categories.

Journal ArticleDOI
06 Sep 2018-Cell
TL;DR: A machine-learning framework to integrate personal genomes and electronic health record (EHR) data is developed and used to study abdominal aortic aneurysm, a prevalent irreversible cardiovascular disease with unclear etiology.

Journal ArticleDOI
TL;DR: A new droplet-based method, sparse isoform sequencing (spISO-seq), sequences 100k-200k partitions of 10-200 molecules at a time, enabling analysis of 10 to 100 million RNA molecules, providing a more comprehensive understanding of the human transcriptome and a general, cost-effective method to analyze it.
Abstract: Understanding transcriptome complexity is crucial for understanding human biology and disease. Technologies such as Synthetic long-read RNA sequencing (SLR-RNA-seq) delivered 5 million isoforms and allowed assessing splicing coordination. Pacific Biosciences and Oxford Nanopore increase throughput also but require high input amounts or amplification. Our new droplet-based method, sparse isoform sequencing (spISO-seq), sequences 100k–200k partitions of 10–200 molecules at a time, enabling analysis of 10–100 million RNA molecules. SpISO-seq requires less than 1 ng of input cDNA, limiting or removing the need for prior amplification with its associated biases. Adjusting the number of reads devoted to each molecule reduces sequencing lanes and cost, with little loss in detection power. The increased number of molecules expands our understanding of isoform complexity. In addition to confirming our previously published cases of splicing coordination (e.g., BIN1), the greater depth reveals many new cases, such as MAPT. Coordination of internal exons is found to be extensive among protein coding genes: 23.5%–59.3% (95% confidence interval) of highly expressed genes with distant alternative exons exhibit coordination, showcasing the need for long-read transcriptomics. However, coordination is less frequent for noncoding sequences, suggesting a larger role of splicing coordination in shaping proteins. Groups of genes with coordination are involved in protein–protein interactions with each other, raising the possibility that coordination facilitates complex formation and/or function. We also find new splicing coordination types, involving initial and terminal exons. Our results provide a more comprehensive understanding of the human transcriptome and a general, cost-effective method to analyze it.

Journal ArticleDOI
TL;DR: This study used long-read sequencing for the analysis of pseudorabies virus (PRV) transcriptome, including Oxford Nanopore Technologies MinION, PacBio RS-II, and Illumina HiScanSQ platforms, and revealed 145 upstream ORFs, many of which are located on the longer 5′ isoforms of the transcripts.
Abstract: Third-generation sequencing is an emerging technology that is capable of solving several problems that earlier approaches were not able to, including the identification of transcript isoforms and overlapping transcripts. In this study, we used long-read sequencing for the analysis of the pseudorabies virus (PRV) transcriptome, including Oxford Nanopore Technologies MinION, PacBio RS-II, and Illumina HiScanSQ platforms. We also used data from our previous short-read and long-read sequencing studies for the comparison of the results and in order to confirm the obtained data. Our investigations identified 19 formerly unknown putative protein-coding genes, all of which are 5' truncated forms of earlier annotated longer PRV genes. Additionally, we detected 19 non-coding RNAs, including 5' and 3' truncated transcripts without in-frame ORFs, antisense RNAs, as well as RNA molecules encoded by those parts of the viral genome where no transcription had been detected before. This study has also led to the identification of three complex transcripts and 50 distinct length isoforms, including transcription start and end variants. We also detected 121 novel transcript overlaps, and two transcripts that overlap the replication origins of PRV. Furthermore, in silico analysis revealed 145 upstream ORFs, many of which are located on the longer 5' isoforms of the transcripts.

Journal ArticleDOI
Monika Oláhová1, Wan Hee Yoon2, Kyle Thompson1, Sharayu Jangam2 +211 more; Institutions (12)
TL;DR: Two individuals, each with homozygous missense variants in ATP5F1D, who presented with episodic lethargy, metabolic acidosis, 3-methylglutaconic aciduria, and hyperammonemia are described, establishing c.245C>T and c.317T>G as pathogenic variants leading to a Mendelian mitochondrial disease featuring episodic metabolic decompensation.
Abstract: ATP synthase, H+ transporting, mitochondrial F1 complex, δ subunit (ATP5F1D; formerly ATP5D) is a subunit of mitochondrial ATP synthase and plays an important role in coupling proton translocation and ATP production. Here, we describe two individuals, each with homozygous missense variants in ATP5F1D, who presented with episodic lethargy, metabolic acidosis, 3-methylglutaconic aciduria, and hyperammonemia. Subject 1, homozygous for c.245C>T (p.Pro82Leu), presented with recurrent metabolic decompensation starting in the neonatal period, and subject 2, homozygous for c.317T>G (p.Val106Gly), presented with acute encephalopathy in childhood. Cultured skin fibroblasts from these individuals exhibited impaired assembly of F1FO ATP synthase and subsequent reduced complex V activity. Cells from subject 1 also exhibited a significant decrease in mitochondrial cristae. Knockdown of Drosophila ATPsynδ, the ATP5F1D homolog, in developing eyes and brains caused a near complete loss of the fly head, a phenotype that was fully rescued by wild-type human ATP5F1D. In contrast, expression of the ATP5F1D c.245C>T and c.317T>G variants rescued the head-size phenotype but recapitulated the eye and antennae defects seen in other genetic models of mitochondrial oxidative phosphorylation deficiency. Our data establish c.245C>T (p.Pro82Leu) and c.317T>G (p.Val106Gly) in ATP5F1D as pathogenic variants leading to a Mendelian mitochondrial disease featuring episodic metabolic decompensation.

Journal ArticleDOI
TL;DR: It is found that DNA methylomic changes are associated with infrequent glucose level alteration, whereas the transcriptome underwent dynamic changes during events such as viral infections, while most DNA meta-methylome changes occurred 80–90 days before clinically detectable glucose elevation.
Abstract: Epigenomics regulates gene expression and is as important as genomics in precision personal health, as it is heavily influenced by environment and lifestyle. We profiled whole-genome DNA methylation and the corresponding transcriptome of peripheral blood mononuclear cells collected from a human volunteer over a period of 36 months, generating 28 methylome and 57 transcriptome datasets. We found that DNA methylomic changes are associated with infrequent glucose level alteration, whereas the transcriptome underwent dynamic changes during events such as viral infections. Most DNA meta-methylome changes occurred 80–90 days before clinically detectable glucose elevation. Analysis of the deep personal methylome dataset revealed an unprecedented number of allelic differentially methylated regions that remain stable longitudinally and are preferentially associated with allele-specific gene regulation. Our results revealed that changes in different types of ‘omics’ data associate with different physiological aspects of this individual: DNA methylation with chronic conditions and transcriptome with acute events.

Journal ArticleDOI
TL;DR: Methods and opportunities to bridge genome and dynamic physiology, detect disease at an early stage, and uncover lifestyle and environmental patterns associated with the disease are discussed.
Abstract: The convergence of scientific capability and technology that generates vast health data at diminishing cost has generated opportunities, challenges, and anticipation surrounding future data-centric healthcare models. Individualized health data spanning biomolecular, physiological, and environmental dimensions comprise a personal omics profile. Here, we discuss methods and opportunities to bridge genome and dynamic physiology, detect disease at an early stage, and uncover lifestyle and environmental patterns associated with the disease. Significant challenges exist to aggregate, integrate, and protect personal omics data to advance our understanding of the disease, enable data-driven clinical decisions, and motivate individuals to sustain behavioral change. Since the first sequencing of the human genome in 2003, the relationship between genetic variants and phenotypes has remained a central challenge in medicine. Many diseases including coronary atherosclerosis are polygenic or indeed omnigenic wherein many variants work together to impact a phenotype.1 Potentially confounding factors and small study population size in comparison to the size of the human genome make it challenging to decipher genetic risk for complex and heterogeneous diseases. To better understand how genetic variation maps to complex traits, simultaneous measurements that bridge genotype and phenotype are required. This deep phenotyping is the goal of personal omics profiling,2 which combines measures of the genome, epigenome, transcriptome, proteome, metabolome, and additional omes (Figure [A]). Rapid advances in sequencing and mass spectrometry drive continued improvement in cost, accuracy, and throughput.3,4 Mobile and wearable technologies enable physiological, contextual, and environmental measurements. As we learn more about the symbiotic functions of the microbiome in human health, we also apply multiomic profiling to microbial populations (Figure [B]). 
Together these measurements provide a holistic profile of dynamic health and facilitate personalized, precision interventions based on predictive models (Figure [E]). Figure. Overview of personal omics. A, Omic measures span …

Journal ArticleDOI
TL;DR: The fetal genetic contribution to PTB is unlikely to be due to a single common genetic variant, but could be explained by interactions of multiple common variants, or of rare variants affected by environmental influences, all not detectable using a GWAS alone.
Abstract: Preterm birth (PTB), or the delivery prior to 37 weeks of gestation, is a significant cause of infant morbidity and mortality. Although twin studies estimate that maternal genetic contributions account for approximately 30% of the incidence of PTB, and other studies reported fetal gene polymorphism association, to date no consistent associations have been identified. In this study, we performed the largest reported genome-wide association study analysis on 1,349 cases of PTB and 12,595 ancestry-matched controls from the focusing on genomic fetal signals. We tested over 2 million single nucleotide polymorphisms (SNPs) for associations with PTB across five subpopulations: African (AFR), the Americas (AMR), European, South Asian, and East Asian. We identified only two intergenic loci associated with PTB at a genome-wide level of significance: rs17591250 (P = 4.55E-09) on chromosome 1 in the AFR population and rs1979081 (P = 3.72E-08) on chromosome 8 in the AMR group. We have queried several existing replication cohorts and found no support of these associations. We conclude that the fetal genetic contribution to PTB is unlikely to be due to a single common genetic variant, but could be explained by interactions of multiple common variants, or of rare variants affected by environmental influences, all not detectable using a GWAS alone.

Journal ArticleDOI
TL;DR: How SETD7 acts at sequential steps in cardiac lineage commitment is revealed, and insights into crosstalk between dynamic epigenetic marks and chromatin-modifying enzymes are provided.

Journal ArticleDOI
TL;DR: A large dataset is generated that can serve as a valuable resource for the investigation of the dynamic VACV transcriptome, the virus-host interactions, and RNA base modifications and can provide useful information for novel gene annotations in the VACV genome.
Abstract: Background: Poxviruses are large DNA viruses that infect humans and animals. Vaccinia virus (VACV) has been applied as a live vaccine for immunization against smallpox, which was eradicated by 1980 as a result of worldwide vaccination. VACV is the prototype of poxviruses in the investigation of the molecular pathogenesis of the virus. Short-read sequencing methods have revolutionized transcriptomics; however, they are not efficient in distinguishing between the RNA isoforms and transcript overlaps. Long-read sequencing (LRS) is much better suited to solve these problems and also allows direct RNA sequencing. Despite the scientific relevance of VACV, no LRS data have been generated for the viral transcriptome to date. Findings: For the deep characterization of the VACV RNA profile, various LRS platforms and library preparation approaches were applied. The raw reads were mapped to the VACV reference genome and also to the host (Chlorocebus sabaeus) genome. In this study, we applied the Pacific Biosciences RSII and Sequel platforms, which altogether resulted in 937,531 mapped reads of inserts (1.42 Gb), while we obtained 2,160,348 aligned reads (1.75 Gb) from the different library preparation methods using the MinION device from Oxford Nanopore Technologies. Conclusions: By applying cutting-edge technologies, we were able to generate a large dataset that can serve as a valuable resource for the investigation of the dynamic VACV transcriptome, the virus-host interactions, and RNA base modifications. These data can provide useful information for novel gene annotations in the VACV genome. Our dataset can also be used to analyze the currently available LRS platforms, library preparation methods, and bioinformatics pipelines.

Journal ArticleDOI
TL;DR: In conclusion, cytokine profiling following exercise may help differentiate patients with ME/CFS from sedentary controls.
Abstract: Myalgic Encephalomyelitis or Chronic Fatigue Syndrome (ME/CFS) is a heterogeneous syndrome in which patients often experience severe fatigue and malaise following exertion. Immune and cardiovascular dysfunction have been postulated to play a role in the pathophysiology. We therefore examined whether cytokine profiling or cardiovascular testing following exercise would differentiate patients with ME/CFS. Twenty-four ME/CFS patients were matched to 24 sedentary controls and underwent cardiovascular and circulating immune profiling. Cardiovascular analysis included echocardiography, cardiopulmonary exercise and endothelial function testing. Cytokine and growth factor profiles were analyzed using a 51-plex Luminex bead kit at baseline and 18 hours following exercise. Cardiac structure and exercise capacity were similar between groups. Sparse partial least square discriminant analyses of cytokine profiles 18 hours post exercise offered the most reliable discrimination between ME/CFS and controls (κ = 0.62 (0.34, 0.84)). The most discriminatory cytokines post exercise were CD40L, platelet activator inhibitor, interleukin 1-β, interferon-α and CXCL1. In conclusion, cytokine profiling following exercise may help differentiate patients with ME/CFS from sedentary controls.

Journal ArticleDOI
TL;DR: Functional evidence is provided that genetic variation is associated with dysregulated LMOD1 expression/function in SMCs, together contributing to the heritable risk for CAD.
Abstract: Recent genome-wide association studies (GWAS) have identified multiple new loci which appear to alter coronary artery disease (CAD) risk via arterial wall-specific mechanisms. One of the annotated genes encodes LMOD1 (Leiomodin 1), a member of the actin filament nucleator family that is highly enriched in smooth muscle-containing tissues such as the artery wall. However, it is still unknown whether LMOD1 is the causal gene at this locus and how the associated variants alter LMOD1 expression/function and CAD risk. Using epigenomic profiling we recently identified a non-coding regulatory variant, rs34091558, which is in tight linkage disequilibrium (LD) with the lead CAD GWAS variant, rs2820315. Herein we demonstrate, through expression quantitative trait loci (eQTL) and statistical fine-mapping in GTEx, STARNET, and human coronary artery smooth muscle cell (HCASMC) datasets, that rs34091558 is the top regulatory variant for LMOD1 in vascular tissues. Position weight matrix (PWM) analyses indicate that the protective allele rs34091558-TA forms a conserved Forkhead box O3 (FOXO3) binding motif, which is disrupted by the risk allele rs34091558-A. FOXO3 chromatin immunoprecipitation and reporter assays show reduced FOXO3 binding and LMOD1 transcriptional activity by the risk allele, consistent with effects of FOXO3 downregulation on LMOD1. LMOD1 knockdown results in increased proliferation and migration and decreased cell contraction in HCASMC, and immunostaining in atherosclerotic lesions in the SMC lineage tracing reporter mouse supports a key role for LMOD1 in maintaining the differentiated SMC phenotype. These results provide compelling functional evidence that genetic variation is associated with dysregulated LMOD1 expression/function in SMCs, together contributing to the heritable risk for CAD.

Journal ArticleDOI
TL;DR: Substantial evolutionary innovation in PGR is revealed even during very recent human evolution, and its different forms among human populations likely result in differential susceptibility to progesterone-associated disease conditions including preterm birth.
Abstract: The progesterone receptor (PGR) plays a central role in maintaining pregnancy and is significantly associated with medical conditions such as preterm birth, which affects 12.6% of all births in the U.S. PGR has been evolving rapidly since the common ancestor of human and chimpanzee, and we herein investigated evolutionary dynamics of PGR during recent human migration and population differentiation. Our study revealed substantial population differentiation at the PGR locus driven by natural selection, where very recent positive selection in East Asians has substantially decreased its genetic diversity by nearly fixing evolutionarily novel alleles. On the contrary, in European populations, the PGR locus has been promoted to a highly polymorphic state likely due to balancing selection. Integrating transcriptome data across multiple tissue types together with large-scale genome-wide association data for preterm birth, our study demonstrated the consequence of the selection event in East Asians on remodeling PGR expression specifically in the ovary and determined a significant association of early spontaneous preterm birth with the evolutionarily selected variants. To reconstruct its evolutionary trajectory on the human lineage, we observed substantial differentiation between modern and archaic humans at the PGR locus, including fixation of a deleterious missense allele in the Neanderthal genome that was later introgressed into modern human populations. Taken together, our study revealed substantial evolutionary innovation in PGR even during very recent human evolution, and its different forms among human populations likely result in differential susceptibility to progesterone-associated disease conditions including preterm birth.

Journal ArticleDOI
TL;DR: This work presents its perspective on the role of proteomics and other Omics in precision health and medicine and suggests that proteomics is well-positioned to contribute to health discoveries and management.
Abstract: It is now possible to collect large amounts of health-related data, which have the potential to transform healthcare. Proteomics, with its central position downstream of genetic and epigenetic inputs and upstream of biochemical outputs and integrators of environmental signals, is well-positioned to contribute to health discoveries and management. We present our perspective on the role of proteomics and other Omics in precision health and medicine.

Journal ArticleDOI
28 Mar 2018-PLOS ONE
TL;DR: The roles of NF90 and its splice variant NF110 in regulating transcription as chromatin-interacting proteins had not been comprehensively characterized; this paper shows that NF90/NF110 occupancy colocalizes with chromatin marks associated with active promoters and strong enhancers.
Abstract: NF90 and splice variant NF110 are DNA- and RNA-binding proteins encoded by the Interleukin enhancer-binding factor 3 (ILF3) gene that have been established to regulate RNA splicing, stabilization and export. The roles of NF90 and NF110 in regulating transcription as chromatin-interacting proteins have not been comprehensively characterized. Here, chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) identified 9,081 genomic sites specifically occupied by NF90/NF110 in K562 cells. One third of NF90/NF110 peaks occurred at promoters of annotated genes. NF90/NF110 occupancy colocalized with chromatin marks associated with active promoters and strong enhancers. Comparison with 150 ENCODE ChIP-seq experiments revealed that NF90/NF110 clustered with transcription factors exhibiting preference for promoters over enhancers (POLR2A, MYC, YY1). Differential gene expression analysis following shRNA knockdown of NF90/NF110 in K562 cells revealed that NF90/NF110 activates transcription factors that drive growth and proliferation (EGR1, MYC), while attenuating differentiation along the erythroid lineage (KLF1). NF90/NF110 associates with chromatin to hierarchically regulate transcription factors that promote proliferation and suppress differentiation.

Journal ArticleDOI
TL;DR: A large RNA-Seq dataset, derived from both short- and long-read sequencing, is presented, which can be used to compare different sequencing approaches and library preparation methods, as well as to validate and test bioinformatic pipelines.
Abstract: Pseudorabies virus (PRV) is an alphaherpesvirus of swine. PRV has a large double-stranded DNA genome and, as the latest investigations have revealed, a very complex transcriptome. Here, we present a large RNA-Seq dataset, derived from both short- and long-read sequencing. The dataset contains 1.3 million 100 bp paired-end reads that were obtained from the Illumina random-primed libraries, as well as 10 million 50 bp single-end reads generated by the Illumina polyA-seq. The Pacific Biosciences RSII non-amplified method yielded 57,021 reads of inserts (ROIs) aligned to the viral genome, the amplified method resulted in 158,396 PRV-specific ROIs, while we obtained 12,555 ROIs using the Sequel platform. The Oxford Nanopore MinION device generated 44,006 reads using the regular cDNA-sequencing method, whereas 29,832 and 120,394 reads were produced by using the direct RNA-sequencing and the Cap-selection protocols, respectively. The raw reads were aligned to the PRV reference genome (KJ717942.1). The dataset provided can be used to compare different sequencing approaches and library preparation methods, as well as to validate and test bioinformatic pipelines.

Journal ArticleDOI
17 Dec 2018
TL;DR: The frequency of actionable variants is higher than that reported in most previous studies and suggests added benefit from utilizing expanded gene lists and manual curation to assess actionable findings.
Abstract: Exome sequencing is increasingly utilized in both clinical and nonclinical settings, but little is known about its utility in healthy individuals. Most previous studies on this topic have examined a small subset of genes known to be implicated in human disease and/or have used automated pipelines to assess pathogenicity of known variants. To determine the frequency of both medically actionable and nonactionable but medically relevant exome findings in the general population, we assessed the exomes of 70 participants who have been extensively characterized over the past several years as part of a longitudinal integrated multiomics profiling study. We analyzed exomes by identifying rare likely pathogenic and pathogenic variants in genes associated with Mendelian disease in the Online Mendelian Inheritance in Man (OMIM) database. We then used American College of Medical Genetics (ACMG) guidelines for the classification of rare sequence variants. Additionally, we assessed pharmacogenetic variants. Twelve out of 70 (17%) participants had medically actionable findings in Mendelian disease genes. Five had phenotypes or family histories associated with their genetic variants. The frequency of actionable variants is higher than that reported in most previous studies and suggests added benefit from utilizing expanded gene lists and manual curation to assess actionable findings. A total of 63 participants (90%) had additional nonactionable findings, including 60 who were found to be carriers for recessive diseases and 21 who have increased Alzheimer's disease risk because of heterozygous or homozygous APOE e4 alleles (18 participants had both). Our results suggest that exome sequencing may have considerably more utility for health management in the general population than previously thought.