Showing papers in "Molecular Systems Biology in 2020"


Journal Article
TL;DR: The expression pattern of ACE2 across > 150 different cell types corresponding to all major human tissues and organs based on stringent immunohistochemical analysis constitutes an important resource for further studies on SARS‐CoV‐2 host cell entry, to understand the biology of the disease and to aid in the development of effective treatments to the viral infection.
Abstract: The novel SARS-coronavirus 2 (SARS-CoV-2) poses a global challenge on healthcare and society. For understanding the susceptibility for SARS-CoV-2 infection, the cell type-specific expression of the host cell surface receptor is necessary. The key protein suggested to be involved in host cell entry is angiotensin I converting enzyme 2 (ACE2). Here, we report the expression pattern of ACE2 across > 150 different cell types corresponding to all major human tissues and organs based on stringent immunohistochemical analysis. The results were compared with several datasets both on the mRNA and protein level. ACE2 expression was mainly observed in enterocytes, renal tubules, gallbladder, cardiomyocytes, male reproductive cells, placental trophoblasts, ductal cells, eye, and vasculature. In the respiratory system, the expression was limited, with no or only low expression in a subset of cells in a few individuals, observed by one antibody only. Our data constitute an important resource for further studies on SARS-CoV-2 host cell entry, in order to understand the biology of the disease and to aid in the development of effective treatments to the viral infection.

642 citations


Journal Article
Sarah M. Keating1,2, Dagmar Waltemath3, Matthias König4, Fengkai Zhang5, Andreas Dräger6, Claudine Chaouiya7,8, Frank Bergmann1, Andrew Finney9, Colin S. Gillespie10, Tomáš Helikar11, Stefan Hoops12, Rahuman S Malik-Sheriff, Stuart L. Moodie, Ion I. Moraru13, Chris J. Myers14, Aurélien Naldi15, Brett G. Olivier1,2,16, Sven Sahle1, James C. Schaff, Lucian P. Smith2,17, Maciej J. Swat, Denis Thieffry15, Leandro Watanabe14, Darren J. Wilkinson10,18, Michael L. Blinov13, Kimberly Begley2, James R. Faeder19, Harold F. Gómez20, Thomas M. Hamm6, Yuichiro Inagaki, Wolfram Liebermeister21, Allyson L. Lister22, Daniel Lucio23, Eric Mjolsness24, Carole J. Proctor10, Karthik Raman25, Nicolas Rodriguez26, Clifford A. Shaffer27, Bruce E. Shapiro28, Joerg Stelling20, Neil Swainston29, Naoki Tanimura, John Wagner30, Martin Meier-Schellersheim5, Herbert M. Sauro17, Bernhard O. Palsson31, Hamid Bolouri32, Hiroaki Kitano33, Akira Funahashi34, Henning Hermjakob, John Doyle2, Michael Hucka2, Richard R. Adams, Nicholas Alexander Allen35, Bastian R. Angermann5, Marco Antoniotti36, Gary D. Bader37, Jan Červený38, Mélanie Courtot, Christopher Cox39, Piero Dalle Pezze26, Emek Demir40, William S. Denney, Harish Dharuri41, Julien Dorier, Dirk Drasdo, Ali Ebrahim31, Johannes Eichner, Johan Elf42, Lukas Endler, Chris T. Evelo43, Christoph Flamm44, Ronan M. T. Fleming45, Martina Fröhlich, Mihai Glont, Emanuel Gonçalves46, Martin Golebiewski47, Hovakim Grabski48, Alex Gutteridge, Damon Hachmeister, Leonard A. Harris, Benjamin D. Heavner, Ron Henkel, William S. Hlavacek2, Bin Hu49, Daniel R. Hyduke50, Hidde de Jong, Nick Juty46, Peter D. Karp, Jonathan R. Karr51, Douglas B. Kell52, Roland Keller6, Ilya Kiselev53, Steffen Klamt54, Edda Klipp54, Christian Knüpfer55, Fedor A. Kolpakov, Falko Krause4, Martina Kutmon, Camille Laibe46, Conor Lawless7, Lu Li56, Leslie M. Loew10, Rainer Machné27, Yukiko Matsuoka, Pedro Mendes, Huaiyu Mi57, Florian Mittag1, Pedro T. Monteiro7, Kedar Nath Natarajan, Poul M. F. Nielsen17, Tramy Nguyen, Alida Palmisano58, Jean-Baptiste Pettit14, Thomas Pfau10, Robert Phair13, Tomas Radivoyevitch2, Johann M. Rohwer59, Oliver A. Ruebenacker60, Julio Saez-Rodriguez6, Martin Scharm61, Henning Schmidt47, Falk Schreiber48, Michael Schubert, Roman Schulte24, Stuart C. Sealfon10, Kieran Smallbone, Sylvain Soliman, Melanie I. Stefan2, Devin P. Sullivan28, Koichi Takahashi50, Bas Teusink, David Tolnay2, Ibrahim Vazirabad30, Axel von Kamp54, Ulrike Wittig52, Clemens Wrzodek6, Finja Wrzodek6, Ioannis Xenarios, Anna Zhukova, Jeremy Zucker62
Heidelberg University1, California Institute of Technology2, University of Greifswald3, Humboldt University of Berlin4, National Institutes of Health5, University of Tübingen6, Instituto Gulbenkian de Ciência7, Aix-Marseille University8, Ansys9, Newcastle University10, University of Nebraska–Lincoln11, University of Virginia12, University of Connecticut13, University of Utah14, PSL Research University15, VU University Amsterdam16, University of Washington17, The Turing Institute18, University of Pittsburgh19, ETH Zurich20, Université Paris-Saclay21, University of Oxford22, North Carolina State University23, University of California, Irvine24, Indian Institute of Technology Madras25, Babraham Institute26, Virginia Tech27, California State University, Northridge28, University of Liverpool29, IBM30, University of California, San Diego31, Virginia Mason Medical Center32, Okinawa Institute of Science and Technology33, Keio University34, Amazon.com35, University of Milan36, University of Toronto37, Masaryk University38, University of Tennessee39, Oregon Health & Science University40, Illumina41, Uppsala University42, Maastricht University43, Alpen-Adria-Universität Klagenfurt44, Medical University of Vienna45, European Bioinformatics Institute46, University of Rostock47, Leibniz Association48, Lorentz Institute49, Shinshu University50, Icahn School of Medicine at Mount Sinai51, Heidelberg Institute for Theoretical Studies52, Greifswald University Hospital53, Max Planck Society54, University of Jena55, École Polytechnique56, University of Southern California57, École Normale Supérieure58, Stellenbosch University59, École Polytechnique Fédérale de Lausanne60, Mizuho Information & Research Institute61, Pacific Northwest National Laboratory62
TL;DR: The latest edition of the Systems Biology Markup Language (SBML), a format designed for exchanging unambiguous model descriptions, is reviewed; it leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models.
Abstract: Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.

176 citations


Journal Article
TL;DR: A highly reproducible mass spectrometry (MS)‐based proteomics workflow for the in‐depth analysis of CSF from minimal sample amounts is presented; the screen reveals a consistent glycolytic signature across cohorts and a recent study, and machine learning suggests clinical utility of this proteomic signature.
Abstract: Neurodegenerative diseases are a growing burden, and there is an urgent need for better biomarkers for diagnosis, prognosis, and treatment efficacy. Structural and functional brain alterations are reflected in the protein composition of cerebrospinal fluid (CSF). Alzheimer's disease (AD) patients have higher CSF levels of tau, but we lack knowledge of systems-wide changes of CSF protein levels that accompany AD. Here, we present a highly reproducible mass spectrometry (MS)-based proteomics workflow for the in-depth analysis of CSF from minimal sample amounts. From three independent studies (197 individuals), we characterize differences in proteins by AD status (> 1,000 proteins, CV < 20%). Proteins with previous links to neurodegeneration such as tau, SOD1, and PARK7 differed most strongly by AD status, providing strong positive controls for our approach. CSF proteome changes in Alzheimer's disease prove to be widespread and often correlated with tau concentrations. Our unbiased screen also reveals a consistent glycolytic signature across our cohorts and a recent study. Machine learning suggests clinical utility of this proteomic signature.

124 citations


Journal Article
TL;DR: Thermal proteome profiling provides a unique insight into protein state and interactions in their native context and at a proteome‐wide level, allowing the study of basic biological processes and their underlying mechanisms.
Abstract: Thermal proteome profiling (TPP) is based on the principle that, when subjected to heat, proteins denature and become insoluble. Proteins can change their thermal stability upon interactions with small molecules (such as drugs or metabolites), nucleic acids or other proteins, or upon post-translational modifications. TPP uses multiplexed quantitative mass spectrometry-based proteomics to monitor the melting profile of thousands of expressed proteins. Importantly, this approach can be performed in vitro, in situ, or in vivo. It has been successfully applied to identify targets and off-targets of drugs, or to study protein-metabolite and protein-protein interactions. Therefore, TPP provides a unique insight into protein state and interactions in their native context and at a proteome-wide level, allowing to study basic biological processes and their underlying mechanisms.
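
The melting-profile idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the logistic curve, the `slope` parameter, the temperature grid, and the two melting points (50 and 54 °C) are all hypothetical choices made for the example, not values or models taken from the paper. The sketch estimates the melting temperature as the point where the soluble fraction crosses 0.5 and reports the shift between a vehicle and a treated condition.

```python
import math

def soluble_fraction(temp_c, tm_c, slope=2.0):
    """Idealised two-state melting curve (assumed logistic form):
    fraction of a protein that is still soluble at temp_c."""
    return 1.0 / (1.0 + math.exp((temp_c - tm_c) / slope))

def estimate_tm(temps, fractions):
    """Temperature at which the soluble fraction crosses 0.5,
    found by linear interpolation between neighbouring measurements."""
    points = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (t1 - t0) * (f0 - 0.5) / (f0 - f1)
    raise ValueError("curve does not cross 0.5 within the measured range")

# Hypothetical temperature gradient: 37, 40, ..., 67 degrees Celsius.
temps = list(range(37, 68, 3))
vehicle = [soluble_fraction(t, tm_c=50.0) for t in temps]
treated = [soluble_fraction(t, tm_c=54.0) for t in temps]

# A positive shift means the protein was stabilised in the treated condition.
tm_shift = estimate_tm(temps, treated) - estimate_tm(temps, vehicle)
```

In a real experiment the fractions would come from quantitative mass spectrometry for thousands of proteins at once, and a fitted curve would normally replace the simple interpolation used here.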

117 citations


Journal Article
TL;DR: A new metabolic network reconstruction approach that used organ‐specific information from literature and omics data to generate two sex‐specific whole‐body metabolic (WBM) reconstructions that capture the metabolism of 26 organs and six blood cell types is developed.
Abstract: Comprehensive molecular-level models of human metabolism have been generated on a cellular level. However, models of whole-body metabolism have not been established as they require new methodological approaches to integrate molecular and physiological data. We developed a new metabolic network reconstruction approach that used organ-specific information from literature and omics data to generate two sex-specific whole-body metabolic (WBM) reconstructions. These reconstructions capture the metabolism of 26 organs and six blood cell types. Each WBM reconstruction represents whole-body organ-resolved metabolism with over 80,000 biochemical reactions in an anatomically and physiologically consistent manner. We parameterized the WBM reconstructions with physiological, dietary, and metabolomic data. The resulting WBM models could recapitulate known inter-organ metabolic cycles and energy use. We also illustrate that the WBM models can predict known biomarkers of inherited metabolic diseases in different biofluids. Predictions of basal metabolic rates, by WBM models personalized with physiological data, outperformed current phenomenological models. Finally, integrating microbiome data allowed the exploration of host-microbiome co-metabolism. Overall, the WBM reconstructions, and their derived computational models, represent an important step toward virtual physiological humans.

103 citations


Journal Article
TL;DR: It is shown that the magnitude of early life decline in proteasome levels is a major risk factor for mortality and causative events in the aging process that can be targeted to prevent loss of protein homeostasis and delay the onset of age‐related neurodegeneration are defined.
Abstract: A progressive loss of protein homeostasis is characteristic of aging and a driver of neurodegeneration. To investigate this process quantitatively, we characterized proteome dynamics during brain aging in the short-lived vertebrate Nothobranchius furzeri combining transcriptomics and proteomics. We detected a progressive reduction in the correlation between protein and mRNA, mainly due to post-transcriptional mechanisms that account for over 40% of the age-regulated proteins. These changes cause a progressive loss of stoichiometry in several protein complexes, including ribosomes, which show impaired assembly/disassembly and are enriched in protein aggregates in old brains. Mechanistically, we show that reduction of proteasome activity is an early event during brain aging and is sufficient to induce proteomic signatures of aging and loss of stoichiometry in vivo. Using longitudinal transcriptomic data, we show that the magnitude of early life decline in proteasome levels is a major risk factor for mortality. Our work defines causative events in the aging process that can be targeted to prevent loss of protein homeostasis and delay the onset of age-related neurodegeneration.

95 citations


Journal Article
TL;DR: This work implemented single‐pot solid‐phase‐enhanced sample preparation on a liquid handling robot for automated processing (autoSP3) of tissue lysates in a 96‐well format, enabling reproducible tissue proteomics in a broad range of clinical and non‐clinical applications.
Abstract: High-throughput and streamlined workflows are essential in clinical proteomics for standardized processing of samples from a variety of sources, including fresh-frozen tissue, FFPE tissue, or blood. To reach this goal, we have implemented single-pot solid-phase-enhanced sample preparation (SP3) on a liquid handling robot for automated processing (autoSP3) of tissue lysates in a 96-well format. AutoSP3 performs unbiased protein purification and digestion, and delivers peptides that can be directly analyzed by LCMS, thereby significantly reducing hands-on time, reducing variability in protein quantification, and improving longitudinal reproducibility. We demonstrate the distinguishing ability of autoSP3 to process low-input samples, reproducibly quantifying 500-1,000 proteins from 100 to 1,000 cells. Furthermore, we applied this approach to a cohort of clinical FFPE pulmonary adenocarcinoma (ADC) samples and recapitulated their separation into known histological growth patterns. Finally, we integrated autoSP3 with AFA ultrasonication for the automated end-to-end sample preparation and LCMS analysis of 96 intact tissue samples. Collectively, this constitutes a generic, scalable, and cost-effective workflow with minimal manual intervention, enabling reproducible tissue proteomics in a broad range of clinical and non-clinical applications.

93 citations


Journal Article
TL;DR: DeepSequence clearly stood out, showing both the strongest correlations with DMS data and having the best ability to predict pathogenic mutations, which is especially remarkable given that it is an unsupervised method.
Abstract: To deal with the huge number of novel protein-coding variants identified by genome and exome sequencing studies, many computational variant effect predictors (VEPs) have been developed. Such predictors are often trained and evaluated using different variant data sets, making a direct comparison between VEPs difficult. In this study, we use 31 previously published deep mutational scanning (DMS) experiments, which provide quantitative, independent phenotypic measurements for large numbers of single amino acid substitutions, in order to benchmark and compare 46 different VEPs. We also evaluate the ability of DMS measurements and VEPs to discriminate between pathogenic and benign missense variants. We find that DMS experiments tend to be superior to the top-ranking predictors, demonstrating the tremendous potential of DMS for identifying novel human disease mutations. Among the VEPs, DeepSequence clearly stood out, showing both the strongest correlations with DMS data and having the best ability to predict pathogenic mutations, which is especially remarkable given that it is an unsupervised method. We further recommend SNAP2, DEOGEN2, SNPs&GO, SuSPect and REVEL based upon their performance in these analyses.

86 citations


Journal Article
TL;DR: It is demonstrated that the majority of expression variability results from cell state differences and that the contribution of transcriptional bursting is relatively minimal: after controlling for cell state, the remaining variability is effectively at the Poisson limit for most genes.
Abstract: Gene expression variability in mammalian systems plays an important role in physiological and pathophysiological conditions. This variability can come from differential regulation related to cell state (extrinsic) and allele-specific transcriptional bursting (intrinsic). Yet, the relative contribution of these two distinct sources is unknown. Here, we exploit the qualitative difference in the patterns of covariance between these two sources to quantify their relative contributions to expression variance in mammalian cells. Using multiplexed error robust RNA fluorescent in situ hybridization (MERFISH), we measured the multivariate gene expression distribution of 150 genes related to Ca2+ signaling coupled with the dynamic Ca2+ response of live cells to ATP. We show that after controlling for cellular phenotypic states such as size, cell cycle stage, and Ca2+ response to ATP, the remaining variability is effectively at the Poisson limit for most genes. These findings demonstrate that the majority of expression variability results from cell state differences and that the contribution of transcriptional bursting is relatively minimal.
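
To make "at the Poisson limit" concrete: for Poisson-distributed counts the variance equals the mean, so the Fano factor (variance divided by mean) is 1. The sketch below is a toy simulation, not the authors' analysis. The two cell states, their mean expression levels (5 and 20 transcripts), and the cell numbers are assumptions chosen for illustration. It shows that pooling cells from different states inflates the Fano factor well above 1, while conditioning on state brings it back to about 1.

```python
import math
import random
from statistics import mean, variance

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def fano_factor(counts):
    """Variance-to-mean ratio; equals 1 for an ideal Poisson process."""
    return variance(counts) / mean(counts)

def simulate(n_cells=5000, seed=0):
    """Two hypothetical cell states with purely Poisson noise within each."""
    rng = random.Random(seed)
    low = [poisson_sample(rng, 5.0) for _ in range(n_cells)]
    high = [poisson_sample(rng, 20.0) for _ in range(n_cells)]
    return {
        "pooled": fano_factor(low + high),   # state differences inflate this
        "low_state": fano_factor(low),       # close to 1 within a state
        "high_state": fano_factor(high),     # close to 1 within a state
    }
```

Calling `simulate()` returns the three Fano factors; the pooled value is far above 1 purely because of the difference in means between states, which is the extrinsic contribution the abstract refers to.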

82 citations


Journal Article
TL;DR: Stromal cells are found to exhibit recurring, patient‐independent expression programs, and a reconstructed ligand–receptor map highlights recurring tumor–stroma interactions, providing a resource for understanding human liver malignancies.
Abstract: Malignant cell growth is fueled by interactions between tumor cells and the stromal cells composing the tumor microenvironment. The human liver is a major site of tumors and metastases, but molecular identities and intercellular interactions of different cell types have not been resolved in these pathologies. Here, we apply single cell RNA-sequencing and spatial analysis of malignant and adjacent non-malignant liver tissues from five patients with cholangiocarcinoma or liver metastases. We find that stromal cells exhibit recurring, patient-independent expression programs, and reconstruct a ligand-receptor map that highlights recurring tumor-stroma interactions. By combining transcriptomics of laser-capture microdissected regions, we reconstruct a zonation atlas of hepatocytes in the non-malignant sites and characterize the spatial distribution of each cell type across the tumor microenvironment. Our analysis provides a resource for understanding human liver malignancies and may expose potential points of interventions.

73 citations


Journal Article
TL;DR: scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references, enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available.
Abstract: Automated cell type identification is a key computational challenge in single-cell RNA-sequencing (scRNA-seq) data. To capitalise on the large collection of well-annotated scRNA-seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single-cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state-of-the-art methodology in automated cell type identification from scRNA-seq data.

Journal Article
TL;DR: It is found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly and that variability in single‐cell immune repertoire screens can be mitigated by modeling cell‐specific covariates.
Abstract: It has recently become possible to simultaneously assay T-cell specificity with respect to large sets of antigens and the T-cell receptor sequence in high-throughput single-cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T-cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single-cell immune repertoire screens can be mitigated by modeling cell-specific covariates. Lastly, we demonstrate that the number of bound pMHC complexes can be predicted in a continuous fashion providing a gateway to disentangle cell-to-dextramer binding strength and receptor-to-pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single-cell RNA-seq studies on T cells without the need for MHC staining.

Journal Article
TL;DR: This study provides the first proteome‐wide analysis of intrinsic protein disorder for the human nucleolus and shows that nucleolar proteins in general, and mitotic chromosome proteins in particular, have significantly higher intrinsic disorder level compared to cytosolic proteins.
Abstract: The nucleolus is essential for ribosome biogenesis and is involved in many other cellular functions. We performed a systematic spatiotemporal dissection of the human nucleolar proteome using confocal microscopy. In total, 1,318 nucleolar proteins were identified; 287 were localized to fibrillar components, and 157 were enriched along the nucleoplasmic border, indicating a potential fourth nucleolar subcompartment: the nucleoli rim. We found 65 nucleolar proteins (36 uncharacterized) to relocate to the chromosomal periphery during mitosis. Interestingly, we observed temporal partitioning into two recruitment phenotypes: early (prometaphase) and late (after metaphase), suggesting phase-specific functions. We further show that the expression of MKI67 is critical for this temporal partitioning. We provide the first proteome-wide analysis of intrinsic protein disorder for the human nucleolus and show that nucleolar proteins in general, and mitotic chromosome proteins in particular, have significantly higher intrinsic disorder level compared to cytosolic proteins. In summary, this study provides a comprehensive and essential resource of spatiotemporal expression data for the nucleolar proteome as part of the Human Protein Atlas.

Journal Article
TL;DR: A mathematical model for the HPA axis is developed that shows the property of dynamical compensation, where gland masses adjust over weeks to buffer variation in physiological parameters, and suggests that gland‐mass dynamics may play an important role in the pathophysiology of stress‐related disorders.
Abstract: Stress activates a complex network of hormones known as the hypothalamic-pituitary-adrenal (HPA) axis. The HPA axis is dysregulated in chronic stress and psychiatric disorders, but the origin of this dysregulation is unclear and cannot be explained by current HPA models. To address this, we developed a mathematical model for the HPA axis that incorporates changes in the total functional mass of the HPA hormone-secreting glands. The mass changes are caused by HPA hormones which act as growth factors for the glands in the axis. We find that the HPA axis shows the property of dynamical compensation, where gland masses adjust over weeks to buffer variation in physiological parameters. These mass changes explain the experimental findings on dysregulation of cortisol and ACTH dynamics in alcoholism, anorexia, and postpartum. Dysregulation occurs for a wide range of parameters and is exacerbated by impaired glucocorticoid receptor (GR) feedback, providing an explanation for the implication of GR in mood disorders. These findings suggest that gland-mass dynamics may play an important role in the pathophysiology of stress-related disorders.

Journal Article
TL;DR: Linking drug and gene dependency together with genomic data sets uncovered contexts in which molecular networks when perturbed mediate cancer cell loss‐of‐fitness and thereby provide independent and orthogonal evidence of biomarkers for drug development.
Abstract: Low success rates during drug development are due, in part, to the difficulty of defining drug mechanism-of-action and molecular markers of therapeutic activity. Here, we integrated 199,219 drug sensitivity measurements for 397 unique anti-cancer drugs with genome-wide CRISPR loss-of-function screens in 484 cell lines to systematically investigate cellular drug mechanism-of-action. We observed an enrichment for positive associations between the profile of drug sensitivity and knockout of a drug's nominal target, and by leveraging protein-protein networks, we identified pathways underpinning drug sensitivity. This revealed an unappreciated positive association between mitochondrial E3 ubiquitin-protein ligase MARCH5 dependency and sensitivity to MCL1 inhibitors in breast cancer cell lines. We also estimated drug on-target and off-target activity, informing on specificity, potency and toxicity. Linking drug and gene dependency together with genomic data sets uncovered contexts in which molecular networks when perturbed mediate cancer cell loss-of-fitness and thereby provide independent and orthogonal evidence of biomarkers for drug development. This study illustrates how integrating cell line drug sensitivity with CRISPR loss-of-function screens can elucidate mechanism-of-action to advance drug development.

Journal Article
TL;DR: This work presents a generalizable platform for screening and selection of functional bacterial CRISPR‐Cas transcription activators and identifies a novel CRISPR activator, dCas9‐AsiA, that could activate gene expression by more than 200‐fold across genomic and plasmid targets with diverse promoters after directed evolution.
Abstract: Programmable gene activation enables fine-tuned regulation of endogenous and synthetic gene circuits to control cellular behavior. While CRISPR-Cas-mediated gene activation has been extensively developed for eukaryotic systems, similar strategies have been difficult to implement in bacteria. Here, we present a generalizable platform for screening and selection of functional bacterial CRISPR-Cas transcription activators. Using this platform, we identified a novel CRISPR activator, dCas9-AsiA, that could activate gene expression by more than 200-fold across genomic and plasmid targets with diverse promoters after directed evolution. The evolved dCas9-AsiA can simultaneously mediate activation and repression of bacterial regulons in E. coli. We further identified hundreds of promoters with varying basal expression that could be induced by dCas9-AsiA, which provides a rich resource of genetic parts for inducible gene activation. Finally, we show that dCas9-AsiA can be ported to other bacteria of clinical and bioindustrial relevance, thus enabling bacterial CRISPRa in more application areas. This work expands the toolbox for programmable gene regulation in bacteria and provides a useful resource for future engineering of other bacterial CRISPR-based gene regulators.

Journal Article
TL;DR: The Induction Dynamics gene Expression Atlas (IDEA) is a dataset constructed by independently inducing hundreds of transcription factors (TFs) and measuring timecourses of the resulting gene expression responses in budding yeast.
Abstract: We present IDEA (the Induction Dynamics gene Expression Atlas), a dataset constructed by independently inducing hundreds of transcription factors (TFs) and measuring timecourses of the resulting gene expression responses in budding yeast. Each experiment captures a regulatory cascade connecting a single induced regulator to the genes it causally regulates. We discuss the regulatory cascade of a single TF, Aft1, in detail; however, IDEA contains > 200 TF induction experiments with 20 million individual observations and 100,000 signal-containing dynamic responses. As an application of IDEA, we integrate all timecourses into a whole-cell transcriptional model, which is used to predict and validate multiple new and underappreciated transcriptional regulators. We also find that the magnitudes of coefficients in this model are predictive of genetic interaction profile similarities. In addition to being a resource for exploring regulatory connectivity between TFs and their target genes, our modeling approach shows that combining rapid perturbations of individual genes with genome-scale time-series measurements is an effective strategy for elucidating gene regulatory networks.

Journal Article
TL;DR: Advances are presented that enable the complete encoding of an electronic chip in the DNA carried by Escherichia coli, an exemplar of design automation pushing engineering beyond that achievable “by hand”, essential for realizing the potential of biology.
Abstract: Synthetic genetic circuits offer the potential to wield computational control over biology, but their complexity is limited by the accuracy of mathematical models. Here, we present advances that enable the complete encoding of an electronic chip in the DNA carried by Escherichia coli (E. coli). The chip is a binary-coded digit (BCD) to 7-segment decoder, associated with clocks and calculators, to turn on segments to visualize 0-9. Design automation is used to build seven strains, each of which contains a circuit with up to 12 repressors and two activators (totaling 63 regulators and 76,000 bp DNA). The inputs to each circuit represent the digit to be displayed (encoded in binary by four molecules), and output is the segment state, reported as fluorescence. Implementation requires an advanced gate model that captures dynamics, promoter interference, and a measure of total power usage (RNAP flux). This project is an exemplar of design automation pushing engineering beyond that achievable "by hand", essential for realizing the potential of biology.
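
The logic that the seven strains implement can be written down as a truth table. The sketch below assumes the conventional a–g labelling of display segments and treats the four input bits as a binary number with `b3` as the most significant bit. Both are assumptions for illustration, since the abstract does not specify the labelling or how inputs 10 to 15 are handled (here they raise an error). `segment_state` plays the role of one strain: it maps the four inputs to the on/off state of a single segment.

```python
# Segments lit for each decimal digit on a standard 7-segment display.
SEGMENTS_BY_DIGIT = {
    0: "abcdef", 1: "bc", 2: "abdeg", 3: "abcdg", 4: "bcfg",
    5: "acdfg", 6: "acdefg", 7: "abc", 8: "abcdefg", 9: "abcdfg",
}

def segment_state(segment, b3, b2, b1, b0):
    """Return True if `segment` is on for the 4-bit input (b3 is the MSB).
    One such function per segment mirrors the one-circuit-per-strain design."""
    digit = 8 * b3 + 4 * b2 + 2 * b1 + b0
    if digit not in SEGMENTS_BY_DIGIT:
        raise ValueError(f"input {digit} is outside the 0-9 range")
    return segment in SEGMENTS_BY_DIGIT[digit]

def decode(b3, b2, b1, b0):
    """Combine all seven single-segment outputs into one display state."""
    return {seg: segment_state(seg, b3, b2, b1, b0) for seg in "abcdefg"}
```

For example, `decode(0, 1, 1, 1)` (the digit 7) turns on only segments a, b, and c, while `decode(1, 0, 0, 0)` (the digit 8) turns on all seven.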

Journal Article
TL;DR: This work generates independent single‐cell RNA‐seq and ATAC‐seq atlases of the Drosophila eye‐antennal disc and spatially integrates the data into a virtual latent space that mimics the organization of the 2D tissue using ScoMAP (Single‐Cell Omics Mapping into spatial Axes using Pseudotime ordering).
Abstract: Single-cell technologies allow measuring chromatin accessibility and gene expression in each cell, but jointly utilizing both layers to map bona fide gene regulatory networks and enhancers remains challenging. Here, we generate independent single-cell RNA-seq and single-cell ATAC-seq atlases of the Drosophila eye-antennal disc and spatially integrate the data into a virtual latent space that mimics the organization of the 2D tissue using ScoMAP (Single-Cell Omics Mapping into spatial Axes using Pseudotime ordering). To validate spatially predicted enhancers, we use a large collection of enhancer-reporter lines and identify ~ 85% of enhancers in which chromatin accessibility and enhancer activity are coupled. Next, we infer enhancer-to-gene relationships in the virtual space, finding that genes are mostly regulated by multiple, often redundant, enhancers. Exploiting cell type-specific enhancers, we deconvolute cell type-specific effects of bulk-derived chromatin accessibility QTLs. Finally, we discover that Prospero drives neuronal differentiation through the binding of a GGG motif. In summary, we provide a comprehensive spatial characterization of gene regulation in a 2D tissue.

Journal ArticleDOI
TL;DR: It is reported that oscillations initiate in embryos, arrest transiently after hatching and in response to perturbation, and cease in adults, and that oscillator arrests occur reproducibly in a specific phase.
Abstract: Gene expression oscillators can structure biological events temporally and spatially. Different biological functions benefit from distinct oscillator properties. Thus, finite developmental processes rely on oscillators that start and stop at specific times, a poorly understood behavior. Here, we have characterized a massive gene expression oscillator comprising > 3,700 genes in Caenorhabditis elegans larvae. We report that oscillations initiate in embryos, arrest transiently after hatching and in response to perturbation, and cease in adults. Experimental observation of the transitions between oscillatory and non-oscillatory states at high temporal resolution reveals an oscillator operating near a Saddle Node on Invariant Cycle (SNIC) bifurcation. These findings constrain the architecture and mathematical models that can represent this oscillator. They also reveal that oscillator arrests occur reproducibly in a specific phase. Since we find oscillations to be coupled to developmental processes, including molting, this characteristic of SNIC bifurcations endows the oscillator with the potential to halt larval development at defined intervals, and thereby execute a developmental checkpoint function.

Journal ArticleDOI
TL;DR: This review provides a brief overview of the technical notions behind generative models and their implementation with deep learning techniques and describes several different ways in which these models can be utilized in practice, using several recent applications in molecular biology as examples.
Abstract: Generative models provide a well-established statistical framework for evaluating uncertainty and deriving conclusions from large data sets especially in the presence of noise, sparsity, and bias. Initially developed for computer vision and natural language processing, these models have been shown to effectively summarize the complexity that underlies many types of data and enable a range of applications including supervised learning tasks, such as assigning labels to images; unsupervised learning tasks, such as dimensionality reduction; and out-of-sample generation, such as de novo image synthesis. With this early success, the power of generative models is now being increasingly leveraged in molecular biology, with applications ranging from designing new molecules with properties of interest to identifying deleterious mutations in our genomes and to dissecting transcriptional variability between single cells. In this review, we provide a brief overview of the technical notions behind generative models and their implementation with deep learning techniques. We then describe several different ways in which these models can be utilized in practice, using several recent applications in molecular biology as examples.

Journal ArticleDOI
TL;DR: Breaking down the observed death rate into two factors, maintenance rate and recycling yield, reveals that slower growing cells display a decreased maintenance rate per cell volume during starvation, thereby decreasing their death rate.
Abstract: Fitness of bacteria is determined both by how fast cells grow when nutrients are abundant and by how well they survive when conditions worsen. Here, we study how prior growth conditions affect the death rate of Escherichia coli during carbon starvation. We control the growth rate prior to starvation either via the carbon source or via a carbon-limited chemostat. We find a consistent dependence where death rate depends on the prior growth conditions only via the growth rate, with slower growth leading to exponentially slower death. Breaking down the observed death rate into two factors, maintenance rate and recycling yield, reveals that slower growing cells display a decreased maintenance rate per cell volume during starvation, thereby decreasing their death rate. In contrast, the ability to scavenge nutrients from carcasses of dead cells (recycling yield) remains constant. Our results suggest a physiological trade-off between rapid proliferation and long survival. We explore the implications of this trade-off within a mathematical model, which can rationalize the observation that bacteria outside of lab environments are not optimized for fast growth.

Journal ArticleDOI
TL;DR: Combining microfluidic experiments with mathematical modeling can be a novel tool toward cancer precision medicine; the resulting models are used to investigate heterogeneity in pancreatic cancer patients, showing dissimilarities especially in the PI3K‐Akt pathway.
Abstract: Mechanistic modeling of signaling pathways mediating patient-specific response to therapy can help to unveil resistance mechanisms and improve therapeutic strategies. Yet, creating such models for patients, in particular for solid malignancies, is challenging. A major hurdle to build these models is the limited material available that precludes the generation of large-scale perturbation data. Here, we present an approach that couples ex vivo high-throughput screenings of cancer biopsies using microfluidics with logic-based modeling to generate patient-specific dynamic models of extrinsic and intrinsic apoptosis signaling pathways. We used the resulting models to investigate heterogeneity in pancreatic cancer patients, showing dissimilarities especially in the PI3K-Akt pathway. Variation in model parameters reflected well the different tumor stages. Finally, we used our dynamic models to efficaciously predict new personalized combinatorial treatments. Our results suggest that our combination of microfluidic experiments and mathematical model can be a novel tool toward cancer precision medicine.

Journal ArticleDOI
TL;DR: A synthetic biology framework is presented to understand and characterize the spatiotemporal patterning properties of the toggle switch and demonstrates how the hysteresis, position, timing, and precision of the boundary can be controlled, highlighting the dynamical flexibility of the circuit.
Abstract: The formation of spatiotemporal patterns of gene expression is frequently guided by gradients of diffusible signaling molecules. The toggle switch subnetwork, composed of two cross-repressing transcription factors, is a common component of gene regulatory networks in charge of patterning, converting the continuous information provided by the gradient into discrete abutting stripes of gene expression. We present a synthetic biology framework to understand and characterize the spatiotemporal patterning properties of the toggle switch. To this end, we built a synthetic toggle switch controllable by diffusible molecules in Escherichia coli. We analyzed the patterning capabilities of the circuit by combining quantitative measurements with a mathematical reconstruction of the underlying dynamical system. The toggle switch can produce robust patterns with sharp boundaries, governed by bistability and hysteresis. We further demonstrate how the hysteresis, position, timing, and precision of the boundary can be controlled, highlighting the dynamical flexibility of the circuit.
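
The bistability mentioned in this abstract can be illustrated with a generic two-repressor model. The sketch below assumes a symmetric, dimensionless textbook form (each repressor is produced at rate `alpha / (1 + other**n)` and decays at unit rate) and integrates it with a simple Euler scheme. It is not the dynamical system reconstructed in the paper, and `simulate_toggle` and all parameter values are hypothetical.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Euler integration of a symmetric two-repressor toggle switch.

    Assumed model (illustrative only):
        du/dt = alpha / (1 + v**n) - u
        dv/dt = alpha / (1 + u**n) - v
    Returns the concentrations (u, v) after steps * dt time units.
    """
    u, v = float(u0), float(v0)
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two initial conditions on opposite sides of the diagonal u = v
# settle into different stable states.
state_u_high = simulate_toggle(5.0, 1.0)
state_v_high = simulate_toggle(1.0, 5.0)
```

With these assumed parameters the same circuit ends in either a u-high or a v-high state depending only on where it starts, which is the history dependence that underlies hysteresis.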

Journal ArticleDOI
TL;DR: A new layer of complexity is revealed in the machinery controlling this prevalent modification of proteins and it is suggested that other eukaryotic GNATs may also possess these previously underappreciated broader enzymatic activities.
Abstract: Protein acetylation is a highly frequent protein modification. However, comparatively little is known about its enzymatic machinery. N-α-acetylation (NTA) and ε-lysine acetylation (KA) are known to be catalyzed by distinct families of enzymes (NATs and KATs, respectively), although the possibility that the same GCN5-related N-acetyltransferase (GNAT) can perform both functions has been debated. Here, we discovered a new family of plastid-localized GNATs, which possess a dual specificity. All characterized GNAT family members display a number of unique features. Quantitative mass spectrometry analyses revealed that these enzymes exhibit both distinct KA and relaxed NTA specificities. Furthermore, inactivation of GNAT2 leads to significant NTA or KA decreases of several plastid proteins, while proteins of other compartments were unaffected. The data indicate that these enzymes have specific protein targets and likely display partly redundant selectivity, increasing the robustness of the acetylation process in vivo. In summary, this study revealed a new layer of complexity in the machinery controlling this prevalent modification and suggests that other eukaryotic GNATs may also possess these previously underappreciated broader enzymatic activities.

Journal ArticleDOI
TL;DR: This study mined published datasets to determine the effects of hundreds of clinically approved drugs on ACE2 expression and finds that ACEIs are enriched for ACE2‐upregulating drugs, while antineoplastic agents are enriched for ACE2‐downregulating drugs, and that vorinostat and isotretinoin are the top ACE2 up/downregulators, respectively, in cell lines.
Abstract: The COVID-19 pandemic caused by SARS-CoV-2 is a global health challenge. Angiotensin-converting enzyme 2 (ACE2) is the host receptor for SARS-CoV-2 entry. Recent studies have suggested that patients with hypertension and diabetes treated with ACE inhibitors (ACEIs) or angiotensin receptor blockers have a higher risk of COVID-19 infection as these drugs could upregulate ACE2, motivating the study of ACE2 modulation by drugs in current clinical use. Here, we mined published datasets to determine the effects of hundreds of clinically approved drugs on ACE2 expression. We find that ACEIs are enriched for ACE2-upregulating drugs, while antineoplastic agents are enriched for ACE2-downregulating drugs. Vorinostat and isotretinoin are the top ACE2 up/downregulators, respectively, in cell lines. Dexamethasone, a corticosteroid used in treating severe acute respiratory syndrome and COVID-19, significantly upregulates ACE2 both in vitro and in vivo. Further top ACE2 regulators in vivo or in primary cells include erlotinib and bleomycin in the lung and vancomycin, cisplatin, and probenecid in the kidney. Our study provides leads for future work studying ACE2 expression modulators.

Journal ArticleDOI
TL;DR: It is elucidated that the dynamic adaptation of the tRNA pool is largely related to the proliferative state across tissues, which functionally determines a condition‐specific expression program both in healthy and tumor states.
Abstract: Different tissues express genes with particular codon usage and anticodon tRNA repertoires. However, the codon-anticodon co-adaptation in humans is not completely understood, nor is its effect on tissue-specific protein levels. Here, we first validated the accuracy of small RNA-seq for tRNA quantification across five human cell lines. We then analyzed the tRNA abundance of more than 8,000 tumor samples from TCGA, together with their paired mRNA-seq and proteomics data, to determine the Supply-to-Demand Adaptation. We thereby elucidate that the dynamic adaptation of the tRNA pool is largely related to the proliferative state across tissues. The distribution of such tRNA pools over the whole cellular translatome affects the subsequent translational efficiency, which functionally determines a condition-specific expression program both in healthy and tumor states. Furthermore, the aberrant translational efficiency of some codons in cancer, exemplified by ProCCA and GlyGGT, is associated with poor patient survival. The regulation of these tRNA profiles is partly explained by the tRNA gene copy numbers and their promoter DNA methylation.

Journal ArticleDOI
TL;DR: This work presents an approach to chemically modify streptavidin, thus rendering it resistant to proteolysis by trypsin and LysC, which results in over 100‐fold reduction of streptavidin contamination and in better coverage of proteins interacting with various biotinylated bait molecules.
Abstract: Streptavidin-mediated enrichment is a powerful strategy to identify biotinylated biomolecules and their interaction partners; however, intense streptavidin-derived peptides impede protein identification by mass spectrometry. Here, we present an approach to chemically modify streptavidin, thus rendering it resistant to proteolysis by trypsin and LysC. This modification results in over 100-fold reduction of streptavidin contamination and in better coverage of proteins interacting with various biotinylated bait molecules (DNA, protein, and lipid) in an overall simplified workflow.

Journal ArticleDOI
TL;DR: This work combined automated yeast genetics, high‐content screening and neural network‐based image analysis of single cells, focussing on genes that influence the architecture of four subcellular compartments of the endocytic pathway as a model system, to identify 17 distinct mutant phenotypes.
Abstract: Our ability to understand the genotype-to-phenotype relationship is hindered by the lack of detailed understanding of phenotypes at a single-cell level. To systematically assess cell-to-cell phenotypic variability, we combined automated yeast genetics, high-content screening and neural network-based image analysis of single cells, focussing on genes that influence the architecture of four subcellular compartments of the endocytic pathway as a model system. Our unbiased assessment of the morphology of these compartments (endocytic patch, actin patch, late endosome and vacuole) identified 17 distinct mutant phenotypes associated with ~1,600 genes (~30% of all yeast genes). Approximately half of these mutants exhibited multiple phenotypes, highlighting the extent of morphological pleiotropy. Quantitative analysis also revealed that incomplete penetrance was prevalent, with the majority of mutants exhibiting substantial variability in phenotype at the single-cell level. Our single-cell analysis enabled exploration of factors that contribute to incomplete penetrance and cellular heterogeneity, including replicative age, organelle inheritance and response to stress.

Journal ArticleDOI
TL;DR: Genomic landing pads in Escherichia coli at high‐expression sites, flanked by ultrastrong double terminators, are designed, enabling the design of synthetic regulatory networks to guide cells in environments or for applications where plasmid use is infeasible.
Abstract: Genetic circuits have many applications, from guiding living therapeutics to ordering processes in a bioreactor, but to be useful they have to be genetically stable and not hinder the host. Encoding circuits in the genome reduces burden, but this decreases performance and can interfere with native transcription. We have designed genomic landing pads in Escherichia coli at high-expression sites, flanked by ultrastrong double terminators. DNA payloads >8 kb are targeted to the landing pads using phage integrases. One landing pad is dedicated to carrying a sensor array, and two are used to carry genetic circuits. NOT/NOR gates based on repressors are optimized for the genome and characterized in the landing pads. These data are used, in conjunction with design automation software (Cello 2.0), to design circuits that perform quantitatively as predicted. These circuits require fourfold less RNA polymerase than when carried on a plasmid and are stable for weeks in a recA+ strain without selection. This approach enables the design of synthetic regulatory networks to guide cells in environments or for applications where plasmid use is infeasible.
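
To make the idea of repressor-based NOT/NOR gates concrete, the sketch below models a NOT gate as a decreasing Hill-type response and a NOR gate as that same response applied to the sum of two input promoter activities. This functional form and every parameter value (`ymin`, `ymax`, `K`, `n`) are assumptions chosen for illustration. They are not measurements from the paper and not the Cello 2.0 API.

```python
def not_gate(x, ymin=0.1, ymax=10.0, K=1.0, n=2.0):
    """Hill-type repressor response: output falls from ymax toward ymin as input x rises.

    All parameters are hypothetical, in arbitrary promoter-activity units.
    """
    return ymin + (ymax - ymin) * K ** n / (K ** n + x ** n)


def nor_gate(x1, x2, **params):
    """NOR gate sketched as a NOT gate driven by the summed activity of two input promoters."""
    return not_gate(x1 + x2, **params)


# Hypothetical LOW/HIGH input levels for checking the logic.
LOW, HIGH = 0.01, 10.0
outputs = {
    (a, b): nor_gate(a, b)
    for a in (LOW, HIGH)
    for b in (LOW, HIGH)
}
```

Only the `(LOW, LOW)` entry of `outputs` is high, matching NOR logic. Since NOR is functionally complete, gates of this shape can in principle be layered to build larger circuits.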