Showing papers by "Institute for Systems Biology" published in 2017


Journal ArticleDOI
TL;DR: The ProteomeXchange Consortium of proteomics resources was formally started in 2011 to standardize data submission and dissemination of mass spectrometry proteomics data worldwide and is supporting a change in culture of the proteomics field.
Abstract: The ProteomeXchange (PX) Consortium of proteomics resources (http://www.proteomexchange.org) was formally started in 2011 to standardize data submission and dissemination of mass spectrometry proteomics data worldwide. We give an overview of the current consortium activities and describe the advances of the past few years. Augmenting the PX founding members (PRIDE and PeptideAtlas, including the PASSEL resource), two new members have joined the consortium: MassIVE and jPOST. ProteomeCentral remains as the common data access portal, providing the ability to search for data sets in all participating PX resources, now with enhanced data visualization components. We describe the updated submission guidelines, now expanded to include four members instead of two. As demonstrated by data submission statistics, PX is supporting a change in culture of the proteomics field: public data sharing is now an accepted standard, supported by requirements for journal submissions resulting in public data release becoming the norm. More than 4500 data sets have been submitted to the various PX resources since 2012. Human is the most represented species with approximately half of the data sets, followed by some of the main model organisms and a growing list of more than 900 diverse species. Data reprocessing activities are becoming more prominent, with both MassIVE and PeptideAtlas releasing the results of reprocessed data sets. Finally, we outline the upcoming advances for ProteomeXchange.

754 citations


Journal ArticleDOI
Rebecca Sims1, Sven J. van der Lee2, Adam C. Naj3, Céline Bellenguez4  +484 moreInstitutions (120)
TL;DR: Three new genome-wide significant nonsynonymous variants associated with Alzheimer's disease are observed, providing additional evidence that the microglia-mediated innate immune response contributes directly to the development of Alzheimer's disease.
Abstract: We identified rare coding variants associated with Alzheimer's disease in a three-stage case–control study of 85,133 subjects. In stage 1, we genotyped 34,174 samples using a whole-exome microarray. In stage 2, we tested associated variants (P < 1 × 10−4) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, we used an additional 14,997 samples to test the most significant stage 2 associations (P < 5 × 10−8) using imputed genotypes. We observed three new genome-wide significant nonsynonymous variants associated with Alzheimer's disease: a protective variant in PLCG2 (rs72824905: p.Pro522Arg, P = 5.38 × 10−10, odds ratio (OR) = 0.68, minor allele frequency (MAF)cases = 0.0059, MAFcontrols = 0.0093), a risk variant in ABI3 (rs616338: p.Ser209Phe, P = 4.56 × 10−10, OR = 1.43, MAFcases = 0.011, MAFcontrols = 0.008), and a new genome-wide significant variant in TREM2 (rs143332484: p.Arg62His, P = 1.55 × 10−14, OR = 1.67, MAFcases = 0.0143, MAFcontrols = 0.0089), a known susceptibility gene for Alzheimer's disease. These protein-altering changes are in genes highly expressed in microglia and highlight an immune-related protein–protein interaction network enriched for previously identified risk genes in Alzheimer's disease. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to the development of Alzheimer's disease.

730 citations
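
As a rough arithmetic check on the Alzheimer's disease entry above, the minor allele frequencies quoted in the abstract can be turned into crude allelic odds ratios. This is only a sketch: the function name `allelic_odds_ratio` is hypothetical, and the calculation ignores covariates and the multi-stage design, so the values it gives need not match the reported ORs (0.68, 1.43 and 1.67), which come from the study's own association analysis. It is useful mainly for confirming the direction of each effect.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Crude odds ratio for carrying the minor allele, cases versus controls.

    Assumption: this is the unadjusted allelic odds ratio, not the model-based
    estimate reported in the abstract.
    """
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# Minor allele frequencies (cases, controls) as quoted in the abstract.
variants = {
    "PLCG2 p.Pro522Arg": (0.0059, 0.0093),
    "ABI3 p.Ser209Phe": (0.011, 0.008),
    "TREM2 p.Arg62His": (0.0143, 0.0089),
}

for name, (cases, controls) in variants.items():
    crude = allelic_odds_ratio(cases, controls)
    direction = "protective" if crude < 1.0 else "risk"
    print(f"{name}: crude OR = {crude:.2f} ({direction})")
```

The crude ratio for PLCG2 falls below 1 and those for ABI3 and TREM2 fall above 1, which agrees with the protective and risk labels given in the abstract.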


Journal ArticleDOI
A. Gordon Robertson1, Juliann Shih2, Juliann Shih3, Christina Yau4  +170 moreInstitutions (23)
TL;DR: Within D3-UM, EIF1AX- and SRSF2/SF3B1-mutant tumors have distinct somatic copy number alterations and DNA methylation profiles, providing insight into the biology of these low- versus intermediate-risk clinical mutation subtypes.

560 citations


Journal ArticleDOI
TL;DR: This study demonstrates that the acquisition of reproducible quantitative proteomics data by multiple labs is achievable, and broadly serves to increase confidence in SWATH-mass spectrometry data acquisition as a reproducible method for large-scale protein quantification.
Abstract: Quantitative proteomics employing mass spectrometry is an indispensable tool in life science research. Targeted proteomics has emerged as a powerful approach for reproducible quantification but is limited in the number of proteins quantified. SWATH-mass spectrometry consists of data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics (accuracy, sensitivity, and selectivity) of targeted proteomics at large scale. While previous SWATH-mass spectrometry studies have shown high intra-lab reproducibility, this has not been evaluated between labs. In this multi-laboratory evaluation study including 11 sites worldwide, we demonstrate that using SWATH-mass spectrometry data acquisition we can consistently detect and reproducibly quantify >4000 proteins from HEK293 cells. Using synthetic peptide dilution series, we show that the sensitivity, dynamic range and reproducibility established with SWATH-mass spectrometry are uniformly achieved. This study demonstrates that the acquisition of reproducible quantitative proteomics data by multiple labs is achievable, and broadly serves to increase confidence in SWATH-mass spectrometry data acquisition as a reproducible method for large-scale protein quantification. SWATH-mass spectrometry consists of a data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics on the scale of thousands of proteins. Here, using data generated by eleven groups worldwide, the authors show that SWATH-MS is capable of generating highly reproducible data across different laboratories.

372 citations


Journal ArticleDOI
TL;DR: The results show that measurement of personal data clouds over time can improve the understanding of health and disease, including early transitions to disease states.
Abstract: Personal data for 108 individuals were collected during a 9-month period, including whole genome sequences; clinical tests, metabolomes, proteomes, and microbiomes at three time points; and daily activity tracking. Using all of these data, we generated a correlation network that revealed communities of related analytes associated with physiology and disease. Connectivity within analyte communities enabled the identification of known and candidate biomarkers (e.g., gamma-glutamyltyrosine was densely interconnected with clinical analytes for cardiometabolic disease). We calculated polygenic scores from genome-wide association studies (GWAS) for 127 traits and diseases, and used these to discover molecular correlates of polygenic risk (e.g., genetic risk for inflammatory bowel disease was negatively correlated with plasma cystine). Finally, behavioral coaching informed by personal data helped participants to improve clinical biomarkers. Our results show that measurement of personal data clouds over time can improve our understanding of health and disease, including early transitions to disease states.

322 citations
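
The correlation network mentioned in the entry above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the analyte names and values are invented, Pearson correlation is used for simplicity, and the 0.8 threshold is arbitrary. The abstract does not state which correlation measure or cutoff the authors used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_network(measurements, threshold=0.8):
    """Return edges (a, b, r) for analyte pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(measurements), 2):
        r = pearson(measurements[a], measurements[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical toy data: four analytes measured in four samples.
toy = {
    "analyte_a": [1.0, 2.0, 3.0, 4.0],
    "analyte_b": [2.1, 3.9, 6.2, 7.8],
    "analyte_c": [4.0, 3.0, 2.0, 1.0],
    "analyte_d": [1.0, -1.0, -1.0, 1.0],
}

for a, b, r in correlation_network(toy):
    print(f"{a} -- {b}: r = {r:+.2f}")
```

In this toy data the first three analytes form one densely connected group while `analyte_d` stays unconnected, which is the kind of structure a community-detection step would then pick up.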


Journal ArticleDOI
TL;DR: The objective was to determine molecular phenotypes of asthma by analysing sputum cell transcriptomics from 104 moderate-to-severe asthmatic subjects and 16 nonasthmatic subjects, finding one Th2-high eosinophilic phenotype TAC1, and two non-Th2 phenotypes TAC2 and TAC3, characterised by inflammasome-associated and metabolic/mitochondrial pathways, respectively.
Abstract: Asthma is characterised by heterogeneous clinical phenotypes. Our objective was to determine molecular phenotypes of asthma by analysing sputum cell transcriptomics from 104 moderate-to-severe asthmatic subjects and 16 nonasthmatic subjects. After filtering on the differentially expressed genes between eosinophil- and noneosinophil-associated sputum inflammation, we used unbiased hierarchical clustering on 508 differentially expressed genes and gene set variation analysis of specific gene sets. We defined three transcriptome-associated clusters (TACs): TAC1 (characterised by immune receptors IL33R, CCR3 and TSLPR), TAC2 (characterised by interferon-, tumour necrosis factor-α- and inflammasome-associated genes) and TAC3 (characterised by genes of metabolic pathways, ubiquitination and mitochondrial function). TAC1 showed the highest enrichment of gene signatures for interleukin-13/T-helper cell type 2 (Th2) and innate lymphoid cell type 2. TAC1 had the highest sputum eosinophilia and exhaled nitric oxide fraction, and was restricted to severe asthma with oral corticosteroid dependency, frequent exacerbations and severe airflow obstruction. TAC2 showed the highest sputum neutrophilia, serum C-reactive protein levels and prevalence of eczema. TAC3 had normal to moderately high sputum eosinophils and better preserved forced expiratory volume in 1 s. Gene–protein coexpression networks from TAC1 and TAC2 extended this molecular classification. We defined one Th2-high eosinophilic phenotype TAC1, and two non-Th2 phenotypes TAC2 and TAC3, characterised by inflammasome-associated and metabolic/mitochondrial pathways, respectively.

257 citations


Journal ArticleDOI
Diane Lefaudeux1, Bertrand De Meulder1, Matthew J. Loza2, Nancy Peffer2  +219 moreInstitutions (21)
TL;DR: Clustering based on clinicophysiologic parameters yielded 4 stable and reproducible clusters of asthmatic patients that associate with different pathobiological pathways.
Abstract: Background Asthma is a heterogeneous disease in which there is a differential response to asthma treatments. This heterogeneity needs to be evaluated so that a personalized management approach can be provided. Objectives We stratified patients with moderate-to-severe asthma based on clinicophysiologic parameters and performed an omics analysis of sputum. Methods Partition-around-medoids clustering was applied to a training set of 266 asthmatic participants from the European Unbiased Biomarkers for the Prediction of Respiratory Diseases Outcomes (U-BIOPRED) adult cohort using 8 prespecified clinicophysiologic variables. This was repeated in a separate validation set of 152 asthmatic patients. The clusters were compared based on sputum proteomics and transcriptomics data. Results Four reproducible and stable clusters of asthmatic patients were identified. The training set cluster T1 consists of patients with well-controlled moderate-to-severe asthma, whereas cluster T2 is a group of patients with late-onset severe asthma with a history of smoking and chronic airflow obstruction. Cluster T3 is similar to cluster T2 in terms of chronic airflow obstruction but is composed of nonsmokers. Cluster T4 is predominantly composed of obese female patients with uncontrolled severe asthma with increased exacerbations but with normal lung function. The validation set exhibited similar clusters, demonstrating reproducibility of the classification. There were significant differences in sputum proteomics and transcriptomics between the clusters. The severe asthma clusters (T2, T3, and T4) had higher sputum eosinophilia than cluster T1, with no differences in sputum neutrophil counts and exhaled nitric oxide and serum IgE levels. Conclusion Clustering based on clinicophysiologic parameters yielded 4 stable and reproducible clusters that associate with different pathobiological pathways.

216 citations
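
The medoid-based clustering named in the entry above can be sketched in a few lines. This is a simplified k-medoids alternation (assign each point to its nearest medoid, then re-pick each cluster's medoid), not the full partition-around-medoids build and swap procedure, and it uses Euclidean distance on invented two-dimensional points. The function name `kmedoids`, the data, and the starting medoids are all assumptions for illustration; the abstract does not describe the distance measure or software used.

```python
def kmedoids(points, initial_medoids, max_iter=100):
    """Simplified k-medoids by alternating assignment and medoid update.

    `initial_medoids` holds indices into `points`. Assumes the initial medoids
    have distinct coordinates, so that no cluster is ever empty.
    Returns (sorted medoid indices, {medoid index: member indices}).
    """
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    medoids = list(initial_medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist(points[c], points[j]) for j in members))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break  # medoids are stable, so the clustering has converged
        medoids = new_medoids
    return sorted(medoids), clusters


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, clusters = kmedoids(points, initial_medoids=[1, 4])
print(medoids)
print(clusters)
```

Because a medoid is always an observed data point, each cluster can be summarised by a real participant rather than an averaged profile, which is one common reason for preferring medoids over means with clinical variables.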


Journal ArticleDOI
TL;DR: Differential regulation of enhancers and promoters by two distinct mSWI/SNF assemblies, BAF and PBAF complexes, respectively, is demonstrated, suggesting that each complex has distinct functions that are perturbed upon BAF47 loss.
Abstract: Perturbations to mammalian SWI/SNF (mSWI/SNF or BAF) complexes contribute to more than 20% of human cancers, with driving roles first identified in malignant rhabdoid tumor, an aggressive pediatric cancer characterized by biallelic inactivation of the core BAF complex subunit SMARCB1 (BAF47). However, the mechanism by which this alteration contributes to tumorigenesis remains poorly understood. We find that BAF47 loss destabilizes BAF complexes on chromatin, absent significant changes in complex assembly or integrity. Rescue of BAF47 in BAF47-deficient sarcoma cell lines results in increased genome-wide BAF complex occupancy, facilitating widespread enhancer activation and opposition of Polycomb-mediated repression at bivalent promoters. We demonstrate differential regulation of enhancers and promoters by two distinct mSWI/SNF assemblies, BAF and PBAF complexes, respectively, suggesting that each complex has distinct functions that are perturbed upon BAF47 loss. Our results demonstrate collaborative mechanisms of mSWI/SNF-mediated gene activation, identifying functions that are co-opted or abated to drive human cancers and developmental disorders.

188 citations


Journal ArticleDOI
TL;DR: The history of the HPPP and the advances of human plasma proteomics in general are reviewed, including several recent achievements, and the latest 2017-04 build of Human Plasma PeptideAtlas is presented, which yields ∼43 million peptide-spectrum matches and 122,730 distinct peptide sequences from 178 individual experiments.
Abstract: Human blood plasma provides a highly accessible window to the proteome of any individual in health and disease. Since its inception in 2002, the Human Proteome Organization’s Human Plasma Proteome Project (HPPP) has been promoting advances in the study and understanding of the full protein complement of human plasma and on determining the abundance and modifications of its components. In 2017, we review the history of the HPPP and the advances of human plasma proteomics in general, including several recent achievements. We then present the latest 2017-04 build of Human Plasma PeptideAtlas, which yields ∼43 million peptide-spectrum matches and 122,730 distinct peptide sequences from 178 individual experiments at a 1% protein-level FDR globally across all experiments. Applying the latest Human Proteome Project Data Interpretation Guidelines, we catalog 3509 proteins that have at least two non-nested uniquely mapping peptides of nine amino acids or more and >1300 additional proteins with ambiguous evidence. ...

176 citations


Journal ArticleDOI
TL;DR: ProteomeTools, a project building molecular and digital tools from the human proteome to facilitate biomedical research, is described and the generation and multimodal liquid chromatography–tandem mass spectrometry analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products is reported.
Abstract: We describe ProteomeTools, a project building molecular and digital tools from the human proteome to facilitate biomedical research. Here we report the generation and multimodal liquid chromatography-tandem mass spectrometry analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products, and we exemplify the utility of these data in several applications. The resource (available at http://www.proteometools.org) will be extended to >1 million peptides, and all data will be shared with the community via ProteomicsDB and ProteomeXchange.

172 citations



Journal ArticleDOI
TL;DR: Inflammasome inhibition using CRID3 prevented airway hyperresponsiveness and airway inflammation (both neutrophilia and eosinophilia) in a mouse model of severe allergic asthma.
Abstract: Background Sputum analysis in asthmatic patients is used to define airway inflammatory processes and might guide therapy. Objective We sought to determine differential gene and protein expression in sputum samples from patients with severe asthma (SA) compared with nonsmoking patients with mild/moderate asthma. Methods Induced sputum was obtained from nonsmoking patients with SA, smokers/ex-smokers with severe asthma, nonsmoking patients with mild/moderate asthma (MMAs), and healthy nonsmoking control subjects. Differential cell counts, microarray analysis of cell pellets, and SOMAscan analysis of sputum analytes were performed. CRID3 was used to inhibit the inflammasome in a mouse model of SA. Results Eosinophilic and mixed neutrophilic/eosinophilic inflammation were more prevalent in patients with SA compared with MMAs. Forty-two gene probes were upregulated (>2-fold) in nonsmoking patients with severe asthma compared with MMAs, including IL-1 receptor (IL-1R) family and nucleotide-binding oligomerization domain, leucine-rich repeat and pyrin domain containing 3 (NLRP3) inflammasome members (false discovery rate ... TH2 signature and IL-1 receptor–like 1 (IL1RL1) mRNA expression. These differences were sputum specific because no activation of NLRP3 or enrichment of IL-1R family genes in bronchial brushings or biopsy specimens in patients with SA was observed. Expression of NLRP3 and of the IL-1R family genes was validated in the Airway Disease Endotyping for Personalized Therapeutics cohort. Inflammasome inhibition using CRID3 prevented airway hyperresponsiveness and airway inflammation (both neutrophilia and eosinophilia) in a mouse model of severe allergic asthma. Conclusion IL1RL1 gene expression is associated with eosinophilic SA, whereas NLRP3 inflammasome expression is highest in patients with neutrophilic SA. TH2-driven eosinophilic inflammation and neutrophil-associated inflammasome activation might represent interacting pathways in patients with SA.

Journal ArticleDOI
TL;DR: An inference scheme using the currently available inflammatory biomarkers sputum eosinophilia and fractional exhaled nitric oxide levels, along with oral corticosteroid use, is described that could predict the subtypes of gene expression within bronchial biopsies and epithelial cells with good sensitivity and specificity.
Abstract: Rationale and objectives: Asthma is a heterogeneous disease driven by diverse immunologic and inflammatory mechanisms. We used transcriptomic profiling of airway tissues to help define asthma phenotypes. Methods: The transcriptomes from bronchial biopsies and epithelial brushings of 107 moderate-to-severe asthmatics were annotated by gene-set variation analysis (GSVA) using 42 gene-signatures relevant to asthma, inflammation and immune function. Topological data analysis (TDA) of clinical and histological data was used to derive clusters, and the nearest shrunken centroid algorithm was used for signature refinement. Results: Nine GSVA signatures expressed in bronchial biopsies and airway epithelial brushings distinguished two distinct asthma subtypes associated with high expression of T-helper type 2 (Th-2) cytokines and lack of corticosteroid response (Group 1 and Group 3). Group 1 had the highest submucosal eosinophils, high exhaled nitric oxide (FeNO) levels, exacerbation rates and oral corticosteroid (OCS) use whilst Group 3 patients showed the highest levels of sputum eosinophils and had a high BMI. In contrast, Group 2 and Group 4 patients had an 86% and 64% probability of having non-eosinophilic inflammation. Using machine-learning tools, we describe an inference scheme using the currently-available inflammatory biomarkers sputum eosinophilia and exhaled nitric oxide levels along with OCS use that could predict the subtypes of gene expression within bronchial biopsies and epithelial cells with good sensitivity and specificity. Conclusion: This analysis demonstrates the usefulness of a transcriptomic-driven approach to phenotyping that segments patients who may benefit the most from specific agents that target Th2-mediated inflammation and/or corticosteroid insensitivity.

Journal ArticleDOI
Josep M. Antó, Jean Bousquet1, Jean Bousquet2, Mübeccel Akdis3, Charles Auffray4, Thomas Keil5, Isabelle Momas6, Dirkje S. Postma7, Rudolf Valenta8, Magnus Wickman9, Anne Cambon-Thomsen10, Tari Haahtela11, Bart N. Lambrecht12, Karin C. Lødrup Carlsen13, Gerard H. Koppelman7, Jordi Sunyer, Torsten Zuberbier5, I. Annesi-Maesano14, Albert Arno, Carsten Bindslev-Jensen15, Giuseppe De Carlo, Francesco Forastiere, Joachim Heinrich, Marek L. Kowalski16, Dieter Maier17, Erik Melén9, Henriette A. Smit18, Marie Standl, John Wright19, Anna Asarnoj20, Marta Benet, Natalia Ballardini9, Natalia Ballardini21, Judith Garcia-Aymerich, Ulrike Gehring18, Stefano Guerra, Cynthia Hohmann5, Inger Kull9, Christian Lupinek8, Mariona Pinart, I. Skrindo13, Marit Westman20, Delphine Smagghe1, Cezmi A. Akdis3, Niklas Andersson9, Claus Bachert22, Stephane Ballereau4, Ferran Ballester23, Xavier Basagaña, Anna Bedbrook, Anna Bergström9, Andrea von Berg, Bert Brunekreef18, Emilie Burte1, Kai-Håkon Carlsen13, Leda Chatzi24, Jonathan M. Coquet12, Mirela Curin8, Pascal Demoly2, Esben Eller15, Maria Pia Fantini25, Leena von Hertzen11, Vergard Hovland13, Bénédicte Jacquemin, Jocelyne Just26, Theresa Keller5, Renata Kiss8, Manolis Kogevinas, Sibylle Koletzko27, Susanne Lau5, Irina Lehmann28, Nicolas Lemonnier, Mika J. Mäkelä11, Jordi Mestres29, Peter Mowinckel13, Rachel Nadif1, Martijn C. Nawijn7, Johan Pellet4, Isabelle Pin, Daniela Porta, Fanny Rancière6, Emmanuelle Rial-Sebbag10, Yvan Saeys12, Martijn J. Schuijs12, Valérie Siroux1, Christina Tischer, Mathies Torrent, Raphaëlle Varraso1, Kalus Wenzel17, Cheng-Jian Xu7 
TL;DR: The translational component of MeDALL is shown by the identification of a novel allergic phenotype characterized by polysensitization and multimorbidity, which is associated with the frequency, persistence, and severity of allergic symptoms.
Abstract: Asthma, rhinitis, and eczema are complex diseases with multiple genetic and environmental factors interlinked through IgE-associated and non-IgE-associated mechanisms. Mechanisms of the Development of ALLergy (MeDALL; EU FP7-CP-IP; project no: 261357; 2010-2015) studied the complex links of allergic diseases at the clinical and mechanistic levels by linking epidemiologic, clinical, and mechanistic research, including in vivo and in vitro models. MeDALL integrated 14 European birth cohorts, including 44,010 participants and 160 cohort follow-ups between pregnancy and age 20 years. Thirteen thousand children were prospectively followed after puberty by using a newly standardized MeDALL Core Questionnaire. A microarray developed for allergen molecules with increased IgE sensitivity was obtained for 3,292 children. Estimates of air pollution exposure from previous studies were available for 10,000 children. Omics data included those from historical genome-wide association studies (23,000 children) and DNA methylation (2,173), targeted multiplex biomarker (1,427), and transcriptomic (723) studies. Using classical epidemiology and machine-learning methods in 16,147 children aged 4 years and 11,080 children aged 8 years, MeDALL showed the multimorbidity of eczema, rhinitis, and asthma and estimated that only 38% of multimorbidity was attributable to IgE sensitization. MeDALL has proposed a new vision of multimorbidity independent of IgE sensitization, and has shown that monosensitization and polysensitization represent 2 distinct phenotypes. The translational component of MeDALL is shown by the identification of a novel allergic phenotype characterized by polysensitization and multimorbidity, which is associated with the frequency, persistence, and severity of allergic symptoms. The results of MeDALL will help integrate personalized, predictive, preventative, and participatory approaches in allergic diseases.

Journal ArticleDOI
TL;DR: Yasset Perez-Riverol, Mingze Bai, Felipe da Veiga Leprevost, Silvano Squizzato, Young Mi Park, Kenneth Haug, Adam J. Carroll, Dylan Spalding, Justin Paschall, Mingxun Wang, Noemi del-Toro, Tobias Ternent, Peng Zhang, Nicola Buso, Nuno Bandeira
Abstract: This work has been supported by the US NIH BD2K grant U54 GM114833 and a National Natural Science Foundation of China grant (61501071). A.I.N. is supported by US National Institute of Health grant (R01-GM-094231). Y.P.-R. is supported by BBSRC ‘PROCESS’ grant (BB/K01997X/1). M.B. is supported by Projects of International Cooperation and Exchanges grant (2014DFB30010). M.W. is supported by an NIH grant (5P41GM103484-07). J.A.V. and N.d.-T. are supported by the Wellcome Trust (grant WT101477MA). T.T. is supported by the BBSRC ‘ProteoGenomics’ grant (BB/L024225/1). E.W.D. and D.S.C. are supported in part by grant (U24 AI117966-02S1). S.-A.S. is supported in part by US NIH BD2K grant (1U24AI117966-01). M.W. and N.Bandeira were supported by NIH grant (5P41GM103484-07). N.Bandeira was also partially supported as an Alfred P. Sloan Fellow. S.Subramaniam is supported by NIH grants U01 DK097430 and U01 CA198941.

Journal ArticleDOI
TL;DR: The benefits of acarbose for T2DM may correlate with the selective modulation of the gut microbiota, as the microbiota may have a critical role in the development of metabolic diseases.
Abstract: Introduction The α-glucosidase inhibitor acarbose is an efficacious medicine for the treatment and prevention of type 2 diabetes mellitus (T2DM). However, the response of gut microbiota to acarbose is important, as the microbiota may have a critical role in the development of metabolic diseases, and acarbose is metabolized exclusively within the gastrointestinal tract. We explored the changes in the proportion and diversity of gut microbiota before and after treatment with acarbose in patients with prediabetes.

Journal ArticleDOI
TL;DR: The results indicated that gene‐environment interactions are important for asthma development and provided supportive evidence for interaction with air pollution for ADCY2, B4GALT5, and DLG2.
Abstract: Rationale: The evidence supporting an association between traffic-related air pollution exposure and incident childhood asthma is inconsistent and may depend on genetic factors. Objectives: To identify gene–environment interaction effects on childhood asthma using genome-wide single-nucleotide polymorphism (SNP) data and air pollution exposure. Identified loci were further analyzed at epigenetic and transcriptomic levels. Methods: We used land use regression models to estimate individual air pollution exposure (represented by outdoor NO2 levels) at the birth address and performed a genome-wide interaction study for doctors’ diagnoses of asthma up to 8 years in three European birth cohorts (n = 1,534) with look-up for interaction in two separate North American cohorts, CHS (Children’s Health Study) and CAPPS/SAGE (Canadian Asthma Primary Prevention Study/Study of Asthma, Genetics and Environment) (n = 1,602 and 186 subjects, respectively). We assessed expression quantitative trait locus effects in human lung...

Journal ArticleDOI
TL;DR: Results suggested that the altered miRNAs are involved in the core processes associated with T2DM, such as carbohydrate and lipid metabolism, the insulin signaling pathway and the adipocytokine signaling pathway.
Abstract: MicroRNAs (miRNAs) are small noncoding RNAs that modulate the cellular transcriptome at the post-transcriptional level. miRNAs play important roles in different disease manifestations, including type 2 diabetes mellitus (T2DM). Many studies have characterized the changes of miRNAs in T2DM, a complex systemic disease; however, few studies have integrated these findings and explored the functional effects of the dysregulated miRNAs identified. To investigate the involvement of miRNAs in T2DM, we obtained and analyzed all relevant studies published prior to 18 October 2016 from various literature databases. From 59 independent studies that met the inclusion criteria, we identified 158 dysregulated miRNAs in seven different major sample types. To understand the functional impact of these deregulated miRNAs, we performed target prediction and pathway enrichment analysis. Results from our analysis suggested that the altered miRNAs are involved in the core processes associated with T2DM, such as carbohydrate and lipid metabolism, the insulin signaling pathway and the adipocytokine signaling pathway. This systematic survey of dysregulated miRNAs provides molecular insights on the effect of deregulated miRNAs in different tissues during the development of diabetes. Some of these miRNAs and their mRNA targets may have diagnostic and/or therapeutic utilities in T2DM.

Journal ArticleDOI
TL;DR: High‐throughput sequencing and genome mapping technologies are combined to generate a validated sequence map of the 20 Spirodela polyrhiza chromosomes and reveal a genome that has undergone reduction, likely through eliminating non‐essential protein coding genes, rDNA and LTRs.
Abstract: Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family.

Journal ArticleDOI
TL;DR: Crops in silico (cropsinsilico.org), an integrative and multi-scale modeling platform, is introduced as one solution that combines isolated modeling efforts toward the generation of virtual crops, which is open and accessible to the entire plant biology community.
Abstract: Multi-scale models can facilitate whole plant simulations by linking gene networks, protein synthesis, metabolic pathways, physiology, and growth. Whole plant models can be further integrated with ecosystem, weather, and climate models to predict how various interactions respond to environmental perturbations. These models have the potential to fill in missing mechanistic details and generate new hypotheses to prioritize directed engineering efforts. Outcomes will potentially accelerate improvement of crop yield, sustainability, and increase future food security. It is time for a paradigm shift in plant modeling, from largely isolated efforts to a connected community that takes advantage of advances in high performance computing and mechanistic understanding of plant processes. Tools for guiding future crop breeding and engineering, understanding the implications of discoveries at the molecular level for whole plant behavior, and improved prediction of plant and ecosystem responses to the environment are urgently needed. The purpose of this perspective is to introduce Crops in silico (cropsinsilico.org), an integrative and multi-scale modeling platform, as one solution that combines isolated modeling efforts toward the generation of virtual crops, which is open and accessible to the entire plant biology community. The major challenges involved both in the development and deployment of a shared, multi-scale modeling platform, which are summarized in this prospectus, were recently identified during the first Crops in silico Symposium and Workshop.

Journal ArticleDOI
TL;DR: The international Testicular Cancer Consortium (TECAC) combined five published genome-wide association studies of testicular germ cell tumor to identify new susceptibility loci, including the first analysis of the X chromosome, which substantially increase the number of known TGCT susceptibility alleles, move the field closer to a comprehensive understanding of the underlying genetic architecture of TGCT, and provide further clues to the etiology of TGCT.
Abstract: Katherine Nathanson, Peter Kanetsky and colleagues present a meta-analysis of five genome-wide association studies of testicular germ cell tumor (TGCT). They identify eight new susceptibility loci and new independent signals at two previously reported loci, providing further clues to the etiology of TGCT.

Journal ArticleDOI
TL;DR: The Proteomics Standards Initiative is reviewed, synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community are described, and a look at future directions of the PSI is provided.
Abstract: The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.

Journal ArticleDOI
TL;DR: A comprehensive systems analysis of the infected brain identified associations between the parasite-brain interactions and epilepsy, movement disorders, Alzheimer’s disease, and cancer.
Abstract: One third of humans are infected lifelong with the brain-dwelling, protozoan parasite, Toxoplasma gondii. Approximately fifteen million of these have congenital toxoplasmosis. Although neurobehavioral disease is associated with seropositivity, causality is unproven. To better understand what this parasite does to human brains, we performed a comprehensive systems analysis of the infected brain: We identified susceptibility genes for congenital toxoplasmosis in our cohort of infected humans and found these genes are expressed in human brain. Transcriptomic and quantitative proteomic analyses of infected human, primary, neuronal stem and monocytic cells revealed effects on neurodevelopment and plasticity in neural, immune, and endocrine networks. These findings were supported by identification of protein and miRNA biomarkers in sera of ill children reflecting brain damage and T. gondii infection. These data were deconvoluted using three systems biology approaches: “Orbital-deconvolution” elucidated upstream, regulatory pathways interconnecting human susceptibility genes, biomarkers, proteomes, and transcriptomes. “Cluster-deconvolution” revealed visual protein-protein interaction clusters involved in processes affecting brain functions and circuitry, including lipid metabolism, leukocyte migration and olfaction. Finally, “disease-deconvolution” identified associations between the parasite-brain interactions and epilepsy, movement disorders, Alzheimer’s disease, and cancer. This “reconstruction-deconvolution” logic provides templates of progenitor cells’ potentiating effects, and components affecting human brain parasitism and diseases.

Journal ArticleDOI
TL;DR: A map of a cell-cell communication network is constructed that indicates what signal is exchanged between which cell types in the tumor and may establish a new formal phenotype of cancer which captures the cell-cell communication structure.
Abstract: Many behaviors of cancer, such as progression, metastasis and drug resistance, cannot be fully understood by genetic mutations or intracellular signaling alone. Instead, they are emergent properties of the cell community which forms a tumor. Studies of tumor heterogeneity reveal that many cancer behaviors critically depend on intercellular communication between cancer cells themselves and between cancer and stromal cells by secreted signaling molecules (ligands) and their cognate receptors. We analyzed a public cancer transcriptome database for changes in cell-cell interactions as a characteristic of malignancy. We curated a list (>2,500 ligand-receptor pairs) and identified their joint enrichment in tumors from TCGA pan-cancer data. From single-cell RNA-Seq data for a case of melanoma and the specificity of the ligand-receptor interactions and their gene expression measured in individual cells, we constructed a map of a cell-cell communication network that indicates what signal is exchanged between which cell types in the tumor. Such networks establish a new formal phenotype of cancer which captures the cell-cell communication structure. This phenotype may open new opportunities for identifying molecular signatures of coordinated behaviors of cancer cells as a population, which in turn may become a determinant of cancer progression potential and prognosis.
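Illustration (not from the paper): the abstract describes building a map of which cell type sends which signal to which other cell type from ligand-receptor pairs and per-cell-type expression. A minimal Python sketch of that idea follows. The expression values, the threshold, the product scoring rule, and the function name `communication_map` are all assumptions made for illustration; the abstract does not specify the authors' actual scoring method.

```python
from itertools import product

# Hypothetical mean expression per cell type (arbitrary units); NOT data from the paper.
expression = {
    "melanoma": {"VEGFA": 5.0, "KDR": 0.0, "CXCL12": 0.5, "CXCR4": 2.0},
    "endothelial": {"VEGFA": 0.2, "KDR": 4.0, "CXCL12": 1.0, "CXCR4": 0.1},
    "fibroblast": {"VEGFA": 1.0, "KDR": 0.0, "CXCL12": 6.0, "CXCR4": 0.0},
}

# Two well-known ligand-receptor pairs, standing in for the curated list of >2,500.
ligand_receptor_pairs = [("VEGFA", "KDR"), ("CXCL12", "CXCR4")]


def communication_map(expression, pairs, threshold=1.0):
    """Return {(sender, receiver): [(ligand, receptor, score), ...]}.

    An edge is recorded when the sender expresses the ligand and the receiver
    expresses the receptor at or above `threshold`. The score is the product
    of the two expression values (an assumed, simplistic scoring rule).
    """
    edges = {}
    for sender, receiver in product(expression, repeat=2):
        for ligand, receptor in pairs:
            lig = expression[sender].get(ligand, 0.0)
            rec = expression[receiver].get(receptor, 0.0)
            if lig >= threshold and rec >= threshold:
                edges.setdefault((sender, receiver), []).append(
                    (ligand, receptor, lig * rec)
                )
    return edges


network = communication_map(expression, ligand_receptor_pairs)
for (sender, receiver), signals in sorted(network.items()):
    for ligand, receptor, score in signals:
        print(f"{sender} -> {receiver}: {ligand}/{receptor} score={score:.1f}")
```

Each key of `network` is a directed edge between two cell types, and each value lists the signals assumed to travel along it. A real analysis would also need normalization and a significance test, which this sketch omits.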

Proceedings Article
01 Jan 2017
TL;DR: A novel generative model with two closely correlated parts, one for communities and the other for semantics is introduced, which is not only robust for finding communities and semantics, but also able to provide more than one semantic explanation to a community.
Abstract: The objective of discovering network communities, an essential step in complex systems analysis, is two-fold: identification of functional modules and their semantics at the same time. However, most existing community-finding methods have focused on finding communities using network topologies, and the problem of extracting module semantics has not been well studied and node contents, which often contain semantic information of nodes and networks, have not been fully utilized. We considered the problem of identifying network communities and module semantics at the same time. We introduced a novel generative model with two closely correlated parts, one for communities and the other for semantics. We developed a co-learning strategy to jointly train the two parts of the model by combining a nested EM algorithm and belief propagation. By extracting the latent correlation between the two parts, our new method is not only robust for finding communities and semantics, but also able to provide more than one semantic explanation to a community. We evaluated the new method on artificial benchmarks and analyzed the semantic interpretability by a case study. We compared the new method with eight state-of-the-art methods on ten real-world networks, showing its superior performance over the existing methods.

Journal ArticleDOI
TL;DR: By quantifying the cell population structure during a critical state transition, key regulators of lineage commitment are identified and the percentage of desired cell types for several protocol variations is predicted 2 weeks in advance, affording a tool that forecasts cell fate outcomes and can be used to optimize differentiation protocols to obtain desired cell populations.
Abstract: Steering the differentiation of induced pluripotent stem cells (iPSCs) toward specific cell types is crucial for patient-specific disease modeling and drug testing. This effort requires the capacity to predict and control when and how multipotent progenitor cells commit to the desired cell fate. Cell fate commitment represents a critical state transition or "tipping point" at which complex systems undergo a sudden qualitative shift. To characterize such transitions during iPSC to cardiomyocyte differentiation, we analyzed the gene expression patterns of 96 developmental genes at single-cell resolution. We identified a bifurcation event early in the trajectory when a primitive streak-like cell population segregated into the mesodermal and endodermal lineages. Before this branching point, we could detect the signature of an imminent critical transition: increase in cell heterogeneity and coordination of gene expression. Correlation analysis of gene expression profiles at the tipping point indicates transcription factors that drive the state transition toward each alternative cell fate and their relationships with specific phenotypic readouts. The latter helps us to facilitate small molecule screening for differentiation efficiency. To this end, we set up an analysis of cell population structure at the tipping point after systematic variation of the protocol to bias the differentiation toward mesodermal or endodermal cell lineage. We were able to predict the proportion of cardiomyocytes many days before cells manifest the differentiated phenotype. The analysis of cell populations undergoing a critical state transition thus affords a tool to forecast cell fate outcomes and can be used to optimize differentiation protocols to obtain desired cell populations.
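Illustration (not from the paper): the abstract names two signatures of an imminent critical transition, rising cell heterogeneity and rising coordination of gene expression. One way to fold both into a single number, used in parts of the critical-transition literature, is to divide the mean absolute gene-gene correlation by the mean cell-cell correlation. The Python sketch below assumes that form; the abstract does not state the authors' formula, and the function names and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mean_correlation(vectors, absolute=False):
    """Average pairwise Pearson correlation over all pairs of vectors."""
    values = [pearson(a, b) for a, b in combinations(vectors, 2)]
    if absolute:
        values = [abs(v) for v in values]
    return sum(values) / len(values)


def transition_index(cells):
    """cells: list of per-cell expression vectors (rows = cells, columns = genes).

    Assumed index: mean |gene-gene correlation| / mean cell-cell correlation.
    It grows when genes become coordinated and cells become less alike.
    """
    genes = [list(column) for column in zip(*cells)]
    return mean_correlation(genes, absolute=True) / mean_correlation(cells)


# Toy data, 4 cells x 3 genes. Stable state: cells nearly identical,
# gene fluctuations uncorrelated with one another.
stable = [[1.1, 5.1, 9.1], [0.9, 5.1, 8.9], [1.1, 4.9, 8.9], [0.9, 4.9, 9.1]]
# Near the tipping point: genes move together while cells spread apart.
near_transition = [[1, 8, 10], [2, 6, 11], [3, 4, 12], [4, 2, 13]]

print(transition_index(stable))
print(transition_index(near_transition))
```

On these toy inputs the index is close to zero for the stable population and above one near the transition. Real single-cell data would need many more cells and genes, plus handling of genes with zero variance.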

Journal ArticleDOI
TL;DR: It is shown that both activation and repression target genes can be bound by Bcl11b in vivo, and that Bcl11b effects overlap with E2A-dependent effects, resolving how innate lymphoid, myeloid, and dendritic, and B-cell fate alternatives are excluded by different mechanisms.
Abstract: T-cell development from hematopoietic progenitors depends on multiple transcription factors, mobilized and modulated by intrathymic Notch signaling. Key aspects of T-cell specification network architecture have been illuminated through recent reports defining roles of transcription factors PU.1, GATA-3, and E2A, their interactions with Notch signaling, and roles of Runx1, TCF-1, and Hes1, providing bases for a comprehensively updated model of the T-cell specification gene regulatory network presented herein. However, the role of lineage commitment factor Bcl11b has been unclear. We use self-organizing maps on 63 RNA-seq datasets from normal and perturbed T-cell development to identify functional targets of Bcl11b during commitment and relate them to other regulomes. We show that both activation and repression target genes can be bound by Bcl11b in vivo, and that Bcl11b effects overlap with E2A-dependent effects. The newly clarified role of Bcl11b distinguishes discrete components of commitment, resolving how innate lymphoid, myeloid, and dendritic, and B-cell fate alternatives are excluded by different mechanisms.

Journal ArticleDOI
TL;DR: A comprehensive and customizable sRNA-Seq data analysis pipeline, sRNAnalyzer, is built, which enables comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs.
Abstract: Although many tools have been developed to analyze small RNA sequencing (sRNA-Seq) data, it remains challenging to accurately analyze the small RNA population, mainly due to multiple sequence ID assignment caused by short read length. Additional issues in small RNA analysis include low consistency of microRNA (miRNA) measurement results across different platforms, miRNA mapping associated with miRNA sequence variation (isomiR) and RNA editing, and the origin of those unmapped reads after screening against all endogenous reference sequence databases. To address these issues, we built a comprehensive and customizable sRNA-Seq data analysis pipeline, sRNAnalyzer, which enables: (i) comprehensive miRNA profiling strategies to better handle isomiRs and summarization based on each nucleotide position to detect potential SNPs in miRNAs, (ii) different sequence mapping result assignment approaches to simulate results from microarray/qRT-PCR platforms and a local probabilistic model to assign mapping results to the most-likely IDs, and (iii) comprehensive ribosomal RNA filtering for accurate mapping of exogenous RNAs and summarization based on taxonomy annotation. We evaluated our pipeline on both artificial samples (including synthetic miRNA and Escherichia coli cultures) and biological samples (human tissue and plasma). sRNAnalyzer is implemented in Perl and available at: http://srnanalyzer.systemsbiology.net/.

Journal ArticleDOI
TL;DR: RNA-seq analysis suggests that both the recruitment and remodeling functions of Snf5 are required in vivo for SWI/SNF regulation of gene expression, and loss of Snf5 alters the structure and function of SWI/SNF.

Journal ArticleDOI
TL;DR: A detailed history of demography and natural selection of this population of Tibetans is inferred, and evidence is detected of population structure between the ancestral Han and Tibetan subpopulations as early as 44 to 58 thousand years ago, but with high rates of gene flow until approximately 9 thousand years ago.
Abstract: The indigenous people of the Tibetan Plateau have been the subject of much recent interest because of their unique genetic adaptations to high altitude. Recent studies have demonstrated that the Tibetan EPAS1 haplotype is involved in high altitude-adaptation and originated in an archaic Denisovan-related population. We sequenced the whole-genomes of 27 Tibetans and conducted analyses to infer a detailed history of demography and natural selection of this population. We detected evidence of population structure between the ancestral Han and Tibetan subpopulations as early as 44 to 58 thousand years ago, but with high rates of gene flow until approximately 9 thousand years ago. The CMS test ranked EPAS1 and EGLN1 as the top two positive selection candidates, and in addition identified PTGIS, VDR, and KCTD12 as new candidate genes. The advantageous Tibetan EPAS1 haplotype shared many variants with the Denisovan genome, with an ancient gene tree divergence between the Tibetan and Denisovan haplotypes of about 1 million years ago. With the exception of EPAS1, we observed no evidence of positive selection on Denisovan-like haplotypes.