Showing papers in "Molecular Systems Biology in 2017"


Journal ArticleDOI
TL;DR: A “rectangular” plasma proteome profiling strategy is proposed, in which the proteome patterns of large cohorts are correlated with their phenotypes in health and disease.
Abstract: Clinical analysis of blood is the most widespread diagnostic procedure in medicine, and blood biomarkers are used to categorize patients and to support treatment decisions. However, existing biomarkers are far from comprehensive and often lack specificity and new ones are being developed at a very slow rate. As described in this review, mass spectrometry (MS)-based proteomics has become a powerful technology in biological research and it is now poised to allow the characterization of the plasma proteome in great depth. Previous "triangular strategies" aimed at discovering single biomarker candidates in small cohorts, followed by classical immunoassays in much larger validation cohorts. We propose a "rectangular" plasma proteome profiling strategy, in which the proteome patterns of large cohorts are correlated with their phenotypes in health and disease. Translating such concepts into clinical practice will require restructuring several aspects of diagnostic decision-making, and we discuss some first steps in this direction.

517 citations


Journal ArticleDOI
TL;DR: GECKO is presented, a method that enhances a GEM to account for enzymes as part of reactions, thereby ensuring that each metabolic flux does not exceed its maximum capacity, equal to the product of the enzyme's abundance and turnover number.
Abstract: Genome-scale metabolic models (GEMs) are widely used to calculate metabolic phenotypes. They rely on defining a set of constraints, the most common of which is that the production of metabolites and/or growth are limited by the carbon source uptake rate. However, enzyme abundances and kinetics, which act as limitations on metabolic fluxes, are not taken into account. Here, we present GECKO, a method that enhances a GEM to account for enzymes as part of reactions, thereby ensuring that each metabolic flux does not exceed its maximum capacity, equal to the product of the enzyme's abundance and turnover number. We applied GECKO to a Saccharomyces cerevisiae GEM and demonstrated that the new model could correctly describe phenotypes that the previous model could not, particularly under high enzymatic pressure conditions, such as yeast growing on different carbon sources in excess, coping with stress, or overexpressing a specific pathway. GECKO also allows direct integration of quantitative proteomics data; by doing so, we significantly reduced flux variability of the model in over 60% of metabolic reactions. Additionally, the model gives insight into the distribution of enzyme usage between and within metabolic pathways. The developed method and model are expected to increase the use of model-based design in metabolic engineering.

334 citations
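
The capacity constraint described in the GECKO abstract above (a flux cannot exceed the enzyme's abundance multiplied by its turnover number) can be illustrated with a minimal sketch. This is not the GECKO implementation: the function names, units, and numeric values are hypothetical and chosen only to show the arithmetic.

```python
# Illustrative sketch only: function names, units, and values are
# hypothetical and are not taken from the GECKO method or its code.

def max_flux(enzyme_abundance, turnover_number):
    """Upper bound on a flux: enzyme abundance times turnover number.

    Assumed units: abundance in mmol/gDW, turnover number in 1/h,
    giving a flux bound in mmol/gDW/h.
    """
    return enzyme_abundance * turnover_number


def cap_flux(requested_flux, enzyme_abundance, turnover_number):
    """Clamp a requested flux so it never exceeds the enzyme capacity."""
    return min(requested_flux, max_flux(enzyme_abundance, turnover_number))


if __name__ == "__main__":
    # Hypothetical enzyme: 0.25 mmol/gDW with a turnover number of 8 per hour.
    print(max_flux(0.25, 8.0))       # capacity of the reaction
    print(cap_flux(5.0, 0.25, 8.0))  # a demand above capacity is clamped
    print(cap_flux(1.0, 0.25, 8.0))  # a demand below capacity is unchanged
```

In a full constraint-based model this bound would enter as an inequality on each enzyme-catalysed reaction rather than as a post hoc clamp; the sketch only shows how the bound itself is computed.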


Journal ArticleDOI
TL;DR: This study uses a deep convolutional neural network (DeepLoc) to analyze yeast cell images and shows improved performance over traditional approaches in the automated classification of protein subcellular localization.
Abstract: Existing computational pipelines for quantitative analysis of high‐content microscopy data rely on traditional machine learning approaches that fail to accurately classify more than a single dataset without substantial tuning and training, requiring extensive analysis. Here, we demonstrate that the application of deep learning to biological image data can overcome the pitfalls associated with conventional machine learning classifiers. Using a deep convolutional neural network (DeepLoc) to analyze yeast cell images, we show improved performance over traditional approaches in the automated classification of protein subcellular localization. We also demonstrate the ability of DeepLoc to classify highly divergent image sets, including images of pheromone‐arrested cells with abnormal cellular morphology, as well as images generated in different genetic backgrounds and in different laboratories. We offer an open‐source implementation that enables updating DeepLoc on new microscopy datasets. This study highlights deep learning as an important tool for the expedited analysis of high‐content microscopy data.

227 citations


Journal ArticleDOI
TL;DR: The CRISPRi library provides a valuable tool for characterization of pneumococcal genes and pathways and revealed several promising antibiotic targets.
Abstract: Genome-wide screens have discovered a large set of essential genes in the opportunistic human pathogen Streptococcus pneumoniae. However, the functions of many essential genes are still unknown, hampering vaccine development and drug discovery. Based on results from transposon sequencing (Tn-seq), we refined the list of essential genes in S. pneumoniae serotype 2 strain D39. Next, we created a knockdown library targeting 348 potentially essential genes by CRISPR interference (CRISPRi) and show a growth phenotype for 254 of them (73%). Using high-content microscopy screening, we searched for essential genes of unknown function with clear phenotypes in cell morphology upon CRISPRi-based depletion. We show that SPD_1416 and SPD_1417 (renamed to MurT and GatD, respectively) are essential for peptidoglycan synthesis, and that SPD_1198 and SPD_1197 (renamed to TarP and TarQ, respectively) are responsible for the polymerization of teichoic acid (TA) precursors. This knowledge enabled us to reconstruct the unique pneumococcal TA biosynthetic pathway. CRISPRi was also employed to unravel the role of the essential Clp-proteolytic system in regulation of competence development, and we show that ClpX is the essential ATPase responsible for ClpP-dependent repression of competence. The CRISPRi library provides a valuable tool for characterization of pneumococcal genes and pathways and revealed several promising antibiotic targets.

216 citations


Journal ArticleDOI
TL;DR: Analysis of reversible drug resistance at a single‐cell level identifies signaling pathways and inhibitory drugs missed by assays that focus on cell populations.
Abstract: Treatment of BRAF‐mutant melanomas with MAP kinase pathway inhibitors is paradigmatic of the promise of precision cancer therapy but also highlights problems with drug resistance that limit patient benefit. We use live‐cell imaging, single‐cell analysis, and molecular profiling to show that exposure of tumor cells to RAF/MEK inhibitors elicits a heterogeneous response in which some cells die, some arrest, and the remainder adapt to drug. Drug‐adapted cells up‐regulate markers of the neural crest (e.g., NGFR), a melanocyte precursor, and grow slowly. This phenotype is transiently stable, reverting to the drug‐naive state within 9 days of drug withdrawal. Transcriptional profiling of cell lines and human tumors implicates a c‐Jun/ECM/FAK/Src cascade in de‐differentiation in about one‐third of cell lines studied; drug‐induced changes in c‐Jun and NGFR levels are also observed in xenograft and human tumors. Drugs targeting the c‐Jun/ECM/FAK/Src cascade as well as BET bromodomain inhibitors increase the maximum effect (Emax) of RAF/MEK kinase inhibitors by promoting cell killing. Thus, analysis of reversible drug resistance at a single‐cell level identifies signaling pathways and inhibitory drugs missed by assays that focus on cell populations.

190 citations


Journal ArticleDOI
TL;DR: This work computationally identifies the first biological thiosulfate sensor and an improved tetrathionate sensor, both two‐component systems from marine Shewanella species, validates them in laboratory Escherichia coli, and develops a method based upon oral gavage and flow cytometry of colon and fecal samples to demonstrate that colon inflammation activates the thiosulfate sensor in mice harboring native gut microbiota.
Abstract: There is a groundswell of interest in using genetically engineered sensor bacteria to study gut microbiota pathways, and diagnose or treat associated diseases. Here, we computationally identify the first biological thiosulfate sensor and an improved tetrathionate sensor, both two‐component systems from marine Shewanella species, and validate them in laboratory Escherichia coli. Then, we port these sensors into a gut‐adapted probiotic E. coli strain, and develop a method based upon oral gavage and flow cytometry of colon and fecal samples to demonstrate that colon inflammation (colitis) activates the thiosulfate sensor in mice harboring native gut microbiota. Our thiosulfate sensor may have applications in bacterial diagnostics or therapeutics. Finally, our approach can be replicated for a wide range of bacterial sensors and should thus enable a new class of minimally invasive studies of gut microbiota pathways. A sensor bacterium that uses a novel two‐component signaling system is engineered to detect thiosulfate and colon inflammation. This work suggests thiosulfate as a novel biomarker of colon inflammation and demonstrates the potential of engineered bacteria in disease diagnostics. Mol Syst Biol. (2017) 13: 923

174 citations


Journal ArticleDOI
TL;DR: This work presents hu.MAP, the most comprehensive and accurate human protein complex map to date, containing > 4,600 total complexes, > 7,700 proteins, and > 56,000 unique interactions, including thousands of confident protein interactions not identified by the original publications.
Abstract: Macromolecular protein complexes carry out many of the essential functions of cells, and many genetic diseases arise from disrupting the functions of such complexes. Currently, there is great interest in defining the complete set of human protein complexes, but recent published maps lack comprehensive coverage. Here, through the synthesis of over 9,000 published mass spectrometry experiments, we present hu.MAP, the most comprehensive and accurate human protein complex map to date, containing > 4,600 total complexes, > 7,700 proteins, and > 56,000 unique interactions, including thousands of confident protein interactions not identified by the original publications. hu.MAP accurately recapitulates known complexes withheld from the learning procedure, which was optimized with the aid of a new quantitative metric (k-cliques) for comparing sets of sets. The vast majority of complexes in our map are significantly enriched with literature annotations, and the map overall shows improved coverage of many disease-associated proteins, as we describe in detail for ciliopathies. Using hu.MAP, we predicted and experimentally validated candidate ciliopathy disease genes in vivo in a model vertebrate, discovering CCDC138, WDR90, and KIAA1328 to be new cilia basal body/centriolar satellite proteins, and identifying ANKRD55 as a novel member of the intraflagellar transport machinery. By offering significant improvements to the accuracy and coverage of human protein complexes, hu.MAP (http://proteincomplexes.org) serves as a valuable resource for better understanding the core cellular functions of human proteins and helping to determine mechanistic foundations of human disease.

163 citations


Journal ArticleDOI
TL;DR: The use of INDRA and natural language to model three biological processes of increasing scope is demonstrated, including p53 dynamics in response to DNA damage, adaptive drug resistance in BRAF‐V600E‐mutant melanomas, and the RAS signaling pathway.
Abstract: Word models (natural language descriptions of molecular mechanisms) are a common currency in spoken and written communication in biomedicine but are of limited use in predicting the behavior of complex biological networks. We present an approach to building computational models directly from natural language using automated assembly. Molecular mechanisms described in simple English are read by natural language processing algorithms, converted into an intermediate representation, and assembled into executable or network models. We have implemented this approach in the Integrated Network and Dynamical Reasoning Assembler (INDRA), which draws on existing natural language processing systems as well as pathway information in Pathway Commons and other online resources. We demonstrate the use of INDRA and natural language to model three biological processes of increasing scope: (i) p53 dynamics in response to DNA damage, (ii) adaptive drug resistance in BRAF-V600E-mutant melanomas, and (iii) the RAS signaling pathway. The use of natural language makes the task of developing a model more efficient and it increases model transparency, thereby promoting collaboration with the broader biology community.

151 citations


Journal ArticleDOI
TL;DR: The results highlight the dependency of cytostatic drugs and pharmacogenomic associations on culture systems, and guide culture selection for drug tests, and highlight the importance of standardization in cancer drug screening.
Abstract: Cancer drug screening in patient-derived cells holds great promise for personalized oncology and drug discovery but lacks standardization. Whether cells are cultured as conventional monolayer or advanced, matrix-dependent organoid cultures influences drug effects and thereby drug selection and clinical success. To precisely compare drug profiles in differently cultured primary cells, we developed DeathPro, an automated microscopy-based assay to resolve drug-induced cell death and proliferation inhibition. Using DeathPro, we screened cells from ovarian cancer patients in monolayer or organoid culture with clinically relevant drugs. Drug-induced growth arrest and efficacy of cytostatic drugs differed between the two culture systems. Interestingly, drug effects in organoids were more diverse and had lower therapeutic potential. Genomic analysis revealed novel links between drug sensitivity and DNA repair deficiency in organoids that were undetectable in monolayers. Thus, our results highlight the dependency of cytostatic drugs and pharmacogenomic associations on culture systems, and guide culture selection for drug tests.

143 citations


Journal ArticleDOI
TL;DR: Improved liver function and decreased HS after supplementation with serine (a precursor to glycine) is found in a proof‐of‐concept human study and a strategy for NAFLD treatment is proposed.
Abstract: To elucidate the molecular mechanisms underlying non-alcoholic fatty liver disease (NAFLD), we recruited 86 subjects with varying degrees of hepatic steatosis (HS). We obtained experimental data on lipoprotein fluxes and used these individual measurements as personalized constraints of a hepatocyte genome-scale metabolic model to investigate metabolic differences in liver, taking into account its interactions with other tissues. Our systems level analysis predicted an altered demand for NAD(+) and glutathione (GSH) in subjects with high HS. Our analysis and metabolomic measurements showed that plasma levels of glycine, serine, and associated metabolites are negatively correlated with HS, suggesting that these GSH metabolism precursors might be limiting. Quantification of the hepatic expression levels of the associated enzymes further pointed to altered de novo GSH synthesis. To assess the effect of GSH and NAD(+) repletion on the development of NAFLD, we added precursors for GSH and NAD(+) biosynthesis to the Western diet and demonstrated that supplementation prevents HS in mice. In a proof-of-concept human study, we found improved liver function and decreased HS after supplementation with serine (a precursor to glycine) and hereby propose a strategy for NAFLD treatment.

138 citations


Journal ArticleDOI
TL;DR: This work examines approaches for defining spatial and temporal host–pathogen protein interactions upon infection of a host cell and discusses methods that characterize the regulation of host and pathogen proteomes through alterations in protein abundance, localization, and post‐translational modifications.
Abstract: Organisms are constantly exposed to microbial pathogens in their environments. When a pathogen meets its host, a series of intricate intracellular interactions shape the outcome of the infection. The understanding of these host–pathogen interactions is crucial for the development of treatments and preventive measures against infectious diseases. Over the past decade, proteomic approaches have become prime contributors to the discovery and understanding of host–pathogen interactions that represent anti‐ and pro‐pathogenic cellular responses. Here, we review these proteomic methods and their application to studying viral and bacterial intracellular pathogens. We examine approaches for defining spatial and temporal host–pathogen protein interactions upon infection of a host cell. Further expanding the understanding of proteome organization during an infection, we discuss methods that characterize the regulation of host and pathogen proteomes through alterations in protein abundance, localization, and post‐translational modifications. Finally, we highlight bioinformatic tools available for analyzing such proteomic datasets, as well as novel strategies for integrating proteomics with other omic tools, such as genomics, transcriptomics, and metabolomics, to obtain a systems‐level understanding of infectious diseases. Mol Syst Biol. (2017) 13: 922

Journal ArticleDOI
TL;DR: A deep mutational scanning framework is developed that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement.
Abstract: Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here, we developed a deep mutational scanning framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.

Journal ArticleDOI
TL;DR: Recognizing condition‐dependent compensatory mechanisms of antibiotic resistance, such as the shift from respiratory to fermentative metabolism of glucose upon overexpression of efflux pumps, opens new perspectives in the fight against emerging antibiotic resistance.
Abstract: Despite our continuous improvement in understanding antibiotic resistance, the interplay between natural selection of resistance mutations and the environment remains unclear. To investigate the role of bacterial metabolism in constraining the evolution of antibiotic resistance, we evolved Escherichia coli growing on glycolytic or gluconeogenic carbon sources to the selective pressure of three different antibiotics. Profiling more than 500 intracellular and extracellular putative metabolites in 190 evolved populations revealed that carbon and energy metabolism strongly constrained the evolutionary trajectories, both in terms of speed and mode of resistance acquisition. To interpret and explore the space of metabolome changes, we developed a novel constraint-based modeling approach using the concept of shadow prices. This analysis, together with genome resequencing of resistant populations, identified condition-dependent compensatory mechanisms of antibiotic resistance, such as the shift from respiratory to fermentative metabolism of glucose upon overexpression of efflux pumps. Moreover, metabolome-based predictions revealed emerging weaknesses in resistant strains, such as the hypersensitivity to fosfomycin of ampicillin-resistant strains. Overall, resolving metabolic adaptation throughout antibiotic-driven evolutionary trajectories opens new perspectives in the fight against emerging antibiotic resistance.

Journal ArticleDOI
TL;DR: A surprisingly simple regulatory program that relies on global transcriptional regulation and input from few intracellular metabolites appears to be sufficient to coordinate E. coli central metabolism and explain about 90% of the experimentally observed transcription changes in 100 genes.
Abstract: Transcription networks consist of hundreds of transcription factors with thousands of often overlapping target genes. While we can reliably measure gene expression changes, we still understand relatively little why expression changes the way it does. How does a coordinated response emerge in such complex networks and how many input signals are necessary to achieve it? Here, we unravel the regulatory program of gene expression in Escherichia coli central carbon metabolism with more than 30 known transcription factors. Using a library of fluorescent transcriptional reporters, we comprehensively quantify the activity of central metabolic promoters in 26 environmental conditions. The expression patterns were dominated by growth rate‐dependent global regulation for most central metabolic promoters in concert with highly condition‐specific activation for only few promoters. Using an approximate mathematical description of promoter activity, we dissect the contribution of global and specific transcriptional regulation. About 70% of the total variance in promoter activity across conditions was explained by global transcriptional regulation. Correlating the remaining specific transcriptional regulation of each promoter with the cell's metabolome response across the same conditions identified potential regulatory metabolites. Remarkably, cyclic AMP, fructose‐1,6‐bisphosphate, and fructose‐1‐phosphate alone explained most of the specific transcriptional regulation through their interaction with the two major transcription factors Crp and Cra. Thus, a surprisingly simple regulatory program that relies on global transcriptional regulation and input from few intracellular metabolites appears to be sufficient to coordinate E. coli central metabolism and explain about 90% of the experimentally observed transcription changes in 100 genes. 
High‐throughput quantification of promoter activity and metabolite concentrations combined with mathematical modeling show that a simple transcriptional regulatory program and input from few metabolites control E. coli central metabolism. Mol Syst Biol. (2017) 13: 903

Journal ArticleDOI
TL;DR: A large‐scale survey of population structure in prevalent human gut microbial species, sampled from their natural environment, with a culture‐independent metagenomic approach provides evidence for subspecies in the majority of abundant gut prokaryotes, leading to a better functional and ecological understanding of the human gut microbiome in conjunction with its host.
Abstract: Population genomics of prokaryotes has been studied in depth in only a small number of primarily pathogenic bacteria, as genome sequences of isolates of diverse origin are lacking for most species. Here, we conducted a large-scale survey of population structure in prevalent human gut microbial species, sampled from their natural environment, with a culture-independent metagenomic approach. We examined the variation landscape of 71 species in 2,144 human fecal metagenomes and found that in 44 of these, accounting for 72% of the total assigned microbial abundance, single-nucleotide variation clearly indicates the existence of sub-populations (here termed subspecies). A single subspecies (per species) usually dominates within each host, as expected from ecological theory. At the global scale, geographic distributions of subspecies differ between phyla, with Firmicutes subspecies being significantly more geographically restricted. To investigate the functional significance of the delineated subspecies, we identified genes that consistently distinguish them in a manner that is independent of reference genomes. We further associated these subspecies-specific genes with properties of the microbial community and the host. For example, two of the three Eubacterium rectale subspecies consistently harbor an accessory pro-inflammatory flagellum operon that is associated with lower gut community diversity, higher host BMI, and higher blood fasting insulin levels. Using an additional 676 human oral samples, we further demonstrate the existence of niche specialized subspecies in the different parts of the oral cavity. Taken together, we provide evidence for subspecies in the majority of abundant gut prokaryotes, leading to a better functional and ecological understanding of the human gut microbiome in conjunction with its host.

Journal ArticleDOI
TL;DR: A theory that predicts whether an isotropic network will contract, expand, or conserve its dimensions is described and it is suggested that pulsatility is an intrinsic behavior of contractile networks if the filaments are not stable but turn over.
Abstract: Morphogenesis in animal tissues is largely driven by actomyosin networks, through tensions generated by an active contractile process. Although the network components and their properties are known, and networks can be reconstituted in vitro, the requirements for contractility are still poorly understood. Here, we describe a theory that predicts whether an isotropic network will contract, expand, or conserve its dimensions. This analytical theory correctly predicts the behavior of simulated networks, consisting of filaments with varying combinations of connectors, and reveals conditions under which networks of rigid filaments are either contractile or expansile. Our results suggest that pulsatility is an intrinsic behavior of contractile networks if the filaments are not stable but turn over. The theory offers a unifying framework to think about mechanisms of contractions or expansion. It provides the foundation for studying a broad range of processes involving cytoskeletal networks and a basis for designing synthetic networks.

Journal ArticleDOI
TL;DR: It is shown that crude cellular extracts of a eukaryotic thermophile, Chaetomium thermophilum, retain basic principles of cellular organization, and the structure of fatty acid synthase is investigated by cryoEM to reveal multiple, flexible states of the enzyme in adaptation to its association with other complexes.
Abstract: The arrangement of proteins into complexes is a key organizational principle for many cellular functions. Although the topology of many complexes has been systematically analyzed in isolation, their molecular sociology in situ remains elusive. Here, we show that crude cellular extracts of a eukaryotic thermophile, Chaetomium thermophilum, retain basic principles of cellular organization. Using a structural proteomics approach, we simultaneously characterized the abundance, interactions, and structure of a third of the C. thermophilum proteome within these extracts. We identified 27 distinct protein communities that include 108 interconnected complexes, which dynamically associate with each other and functionally benefit from being in close proximity in the cell. Furthermore, we investigated the structure of fatty acid synthase within these extracts by cryoEM and this revealed multiple, flexible states of the enzyme in adaptation to its association with other complexes, thus exemplifying the need for in situ studies. As the components of the captured protein communities are known, at both the protein and complex levels, this study constitutes another step forward toward a molecular understanding of subcellular organization.

Journal ArticleDOI
Tobias Fuhrer, Mattia Zampieri, Daniel C. Sévin, Uwe Sauer, Nicola Zamboni
TL;DR: This work systematically mapped the association between > 3,800 single‐gene deletions in the bacterium Escherichia coli and relative concentrations of > 7,000 intracellular metabolite ions and reveals a largely unknown landscape of gene–metabolite interactions that are not represented in metabolic models.
Abstract: Metabolism is one of the best‐understood cellular processes whose network topology of enzymatic reactions is determined by an organism's genome. The influence of genes on metabolite levels, however, remains largely unknown, particularly for the many genes encoding non‐enzymatic proteins. Serendipitously, genomewide association studies explore the relationship between genetic variants and metabolite levels, but a comprehensive interaction network has remained elusive even for the simplest single‐celled organisms. Here, we systematically mapped the association between > 3,800 single‐gene deletions in the bacterium Escherichia coli and relative concentrations of > 7,000 intracellular metabolite ions. Beyond expected metabolic changes in the proximity to abolished enzyme activities, the association map reveals a largely unknown landscape of gene–metabolite interactions that are not represented in metabolic models. Therefore, the map provides a unique resource for assessing the genetic basis of metabolic changes and conversely hypothesizing metabolic consequences of genetic alterations. We illustrate this by predicting metabolism‐related functions of 72 so far not annotated genes and by identifying key genes mediating the cellular response to environmental perturbations. The metabolome of > 3,800 single Escherichia coli gene deletion mutants is analyzed. The obtained gene–metabolite interaction map allows predicting orphan gene functions and interpreting a cell's response to environmental perturbations. Mol Syst Biol. (2017) 13: 907

Journal ArticleDOI
TL;DR: In this article, the authors determined the proteome-wide signatures of the RPD3/HDA1 class of histone deacetylases in Arabidopsis and found that at least 30 proteins function in nucleic acid binding.
Abstract: Histone deacetylases have central functions in regulating stress defenses and development in plants. However, the knowledge about the deacetylase functions is largely limited to histones, although these enzymes were found in diverse subcellular compartments. In this study, we determined the proteome-wide signatures of the RPD3/HDA1 class of histone deacetylases in Arabidopsis. Relative quantification of the changes in the lysine acetylation levels was determined on a proteome-wide scale after treatment of Arabidopsis leaves with deacetylase inhibitors apicidin and trichostatin A. We identified 91 new acetylated candidate proteins other than histones, which are potential substrates of the RPD3/HDA1-like histone deacetylases in Arabidopsis, at least 30 of which function in nucleic acid binding. Furthermore, our analysis revealed that histone deacetylase 14 (HDA14) is the first organellar-localized RPD3/HDA1 class protein found to reside in the chloroplasts and that the majority of its protein targets have functions in photosynthesis. Finally, the analysis of HDA14 loss-of-function mutants revealed that the activation state of RuBisCO is controlled by lysine acetylation of RuBisCO activase under low-light conditions.

Journal ArticleDOI
TL;DR: This work resequenced data from previously published HT‐SELEX experiments, the most extensive mammalian TF–DNA binding data available to date, to reveal the nucleotide position‐dependent DNA shape readout in TF‐binding sites and the TF family‐specific position dependence.
Abstract: Transcription factors (TFs) achieve DNA-binding specificity through contacts with functional groups of bases (base readout) and readout of structural properties of the double helix (shape readout). Currently, it remains unclear whether DNA shape readout is utilized by only a few selected TF families, or whether this mechanism is used extensively by most TF families. We resequenced data from previously published HT-SELEX experiments, the most extensive mammalian TF-DNA binding data available to date. Using these data, we demonstrated the contributions of DNA shape readout across diverse TF families and its importance in core motif-flanking regions. Statistical machine-learning models combined with feature-selection techniques helped to reveal the nucleotide position-dependent DNA shape readout in TF-binding sites and the TF family-specific position dependence. Based on these results, we proposed novel DNA shape logos to visualize the DNA shape preferences of TFs. Overall, this work suggests a way of obtaining mechanistic insights into TF-DNA binding without relying on experimentally solved all-atom structures.

Journal ArticleDOI
TL;DR: Integrative network analyses identified liver‐specific genes linked to NAFLD pathogenesis, such as pyruvate kinase liver and red blood cell (PKLR), or to HCC pathogenesis, such as PKLR, patatin‐like phospholipase domain containing 3 (PNPLA3), and proprotein convertase subtilisin/kexin type 9 (PCSK9), all of which are potential targets for drug development.
Abstract: We performed integrative network analyses to identify targets that can be used for effectively treating liver diseases with minimal side effects. We first generated co-expression networks (CNs) for 46 human tissues and liver cancer to explore the functional relationships between genes and examined the overlap between functional and physical interactions. Since increased de novo lipogenesis is a characteristic of nonalcoholic fatty liver disease (NAFLD) and hepatocellular carcinoma (HCC), we investigated the liver-specific genes co-expressed with fatty acid synthase (FASN). CN analyses predicted that inhibition of these liver-specific genes decreases FASN expression. Experiments in human cancer cell lines, mouse liver samples, and primary human hepatocytes validated our predictions by demonstrating functional relationships between these liver genes, and showing that their inhibition decreases cell growth and liver fat content. In conclusion, we identified liver-specific genes linked to NAFLD pathogenesis, such as pyruvate kinase liver and red blood cell (PKLR), or to HCC pathogenesis, such as PKLR, patatin-like phospholipase domain containing 3 (PNPLA3), and proprotein convertase subtilisin/kexin type 9 (PCSK9), all of which are potential targets for drug development.

Journal ArticleDOI
TL;DR: It is shown that coexpression of bidirectional gene pairs, and closeby genes in general, is buffered at the protein level, which supports the hypothesis that the selection for noise reduction is a major driver of the evolution of genome organisation.
Abstract: Genes are not randomly distributed in the genome. In humans, 10% of protein-coding genes are transcribed from bidirectional promoters and many more are organised in larger clusters. Intriguingly, neighbouring genes are frequently coexpressed but rarely functionally related. Here we show that coexpression of bidirectional gene pairs, and closeby genes in general, is buffered at the protein level. Taking into account the 3D architecture of the genome, we find that co-regulation of spatially close, functionally unrelated genes is pervasive at the transcriptome level, but does not extend to the proteome. We present evidence that non-functional mRNA coexpression in human cells arises from stochastic chromatin fluctuations and direct regulatory interference between spatially close genes. Protein-level buffering likely reflects a lack of coordination of post-transcriptional regulation of functionally unrelated genes. Grouping human genes together along the genome sequence, or through long-range chromosome folding, is associated with reduced expression noise. Our results support the hypothesis that the selection for noise reduction is a major driver of the evolution of genome organisation.

Journal ArticleDOI
TL;DR: To systematically map cargo–NTR relationships in situ, the engineered biotin ligase BirA* was systematically fused to 16 NTRs and the BioID method was extended by the direct identification of biotinylation sites to identify interaction interfaces and to discriminate direct versus piggyback transport mechanisms.
Abstract: Nuclear transport receptors (NTRs) recognize localization signals of cargos to facilitate their passage across the central channel of nuclear pore complexes (NPCs). About 30 different NTRs constitute different transport pathways in humans and bind to a multitude of different cargos. The exact cargo spectrum of the majority of NTRs, their specificity and even the extent to which active nucleocytoplasmic transport contributes to protein localization remains understudied because of the transient nature of these interactions and the wide dynamic range of cargo concentrations. To systematically map cargo–NTR relationships in situ, we used proximity ligation coupled to mass spectrometry (BioID). We systematically fused the engineered biotin ligase BirA* to 16 NTRs. We estimate that a considerable fraction of the human proteome is subject to active nuclear transport. We quantified the specificity and redundancy in NTR interactions and identified transport pathways for cargos. We extended the BioID method by the direct identification of biotinylation sites. This approach enabled us to identify interaction interfaces and to discriminate direct versus piggyback transport mechanisms. Data are available via ProteomeXchange with identifier PXD007976.

Journal ArticleDOI
TL;DR: This work introduces RNA‐seq as a powerful method for circuit characterization and debugging that overcomes the limitations of fluorescent reporters and scales to large systems composed of many parts.
Abstract: Genetic circuits implement computational operations within a cell. Debugging them is difficult because their function is defined by multiple states (e.g., combinations of inputs) that vary in time. Here, we develop RNA‐seq methods that enable the simultaneous measurement of: (i) the states of internal gates, (ii) part performance (promoters, insulators, terminators), and (iii) impact on host gene expression. This is applied to a three‐input one‐output circuit consisting of three sensors, five NOR/NOT gates, and 46 genetic parts. Transcription profiles are obtained for all eight combinations of inputs, from which biophysical models can extract part activities and the response functions of sensors and gates. Various unexpected failure modes are identified, including cryptic antisense promoters, terminator failure, and a sensor malfunction due to media‐induced changes in host gene expression. This can guide the selection of new parts to fix these problems, which we demonstrate by using a bidirectional terminator to disrupt observed antisense transcription. This work introduces RNA‐seq as a powerful method for circuit characterization and debugging that overcomes the limitations of fluorescent reporters and scales to large systems composed of many parts.

Journal ArticleDOI
TL;DR: The integrated mathematical model of Epo‐driven proliferation explains cell type‐specific effects of targeted AKT and ERK inhibitors and faithfully predicts, based on the protein abundance, anti‐proliferative effects of inhibitors in primary human erythroid progenitor cells, suggesting that the effectiveness of targeted cancer therapy might become predictable from protein abundance.
Abstract: Signaling through the AKT and ERK pathways controls cell proliferation. However, the integrated regulation of this multistep process, involving signal processing, cell growth and cell cycle progression, is poorly understood. Here, we study different hematopoietic cell types, in which AKT and ERK signaling is triggered by erythropoietin (Epo). Although these cell types share the molecular network topology for pro-proliferative Epo signaling, they exhibit distinct proliferative responses. Iterating quantitative experiments and mathematical modeling, we identify two molecular sources for cell type-specific proliferation. First, cell type-specific protein abundance patterns cause differential signal flow along the AKT and ERK pathways. Second, downstream regulators of both pathways have differential effects on proliferation, suggesting that protein synthesis is rate-limiting for faster cycling cells while slower cell cycles are controlled at the G1-S progression. The integrated mathematical model of Epo-driven proliferation explains cell type-specific effects of targeted AKT and ERK inhibitors and faithfully predicts, based on the protein abundance, anti-proliferative effects of inhibitors in primary human erythroid progenitor cells. Our findings suggest that the effectiveness of targeted cancer therapy might become predictable from protein abundance.

Journal ArticleDOI
TL;DR: A computational survey of bi‐functional circuits that show no simple structural modularity reveals two distinct classes: hybrid circuits, which overlay two simpler mono‐functional sub‐circuits within their circuitry, and emergent circuits, which do not.
Abstract: A major challenge in systems biology is to understand the relationship between a circuit's structure and its function, but how is this relationship affected if the circuit must perform multiple distinct functions within the same organism? In particular, to what extent do multi-functional circuits contain modules which reflect the different functions? Here, we computationally survey a range of bi-functional circuits which show no simple structural modularity: They can switch between two qualitatively distinct functions, while both functions depend on all genes of the circuit. Our analysis reveals two distinct classes: hybrid circuits which overlay two simpler mono-functional sub-circuits within their circuitry, and emergent circuits, which do not. In this second class, the bi-functionality emerges from more complex designs which are not fully decomposable into distinct modules and are consequently less intuitive to predict or understand. These non-intuitive emergent circuits are just as robust as their hybrid counterparts, and we therefore suggest that the common bias toward studying modular systems may hinder our understanding of real biological circuits.

Journal ArticleDOI
TL;DR: A temporal‐fluxomics approach is developed to derive a comprehensive and quantitative view of alterations in metabolic fluxes throughout the mammalian cell cycle by combining pulse‐chase LC‐MS‐based isotope tracing in synchronized cell populations with computational deconvolution and metabolic flux modeling.
Abstract: Cellular metabolic demands change throughout the cell cycle. Nevertheless, a characterization of how metabolic fluxes adapt to the changing demands throughout the cell cycle is lacking. Here, we developed a temporal-fluxomics approach to derive a comprehensive and quantitative view of alterations in metabolic fluxes throughout the mammalian cell cycle. This is achieved by combining pulse-chase LC-MS-based isotope tracing in synchronized cell populations with computational deconvolution and metabolic flux modeling. We find that TCA cycle fluxes are rewired as cells progress through the cell cycle with complementary oscillations of glucose versus glutamine-derived fluxes: Oxidation of glucose-derived flux peaks in late G1 phase, while oxidative and reductive glutamine metabolism dominates S phase. These complementary flux oscillations maintain a constant production rate of reducing equivalents and oxidative phosphorylation flux throughout the cell cycle. The shift from glucose to glutamine oxidation in S phase plays an important role in cell cycle progression and cell proliferation.

Journal ArticleDOI
TL;DR: This work used a modified membrane yeast two‐hybrid approach and identified interacting partners for 48 selected full‐length human ligand‐unoccupied GPCRs in their native membrane environment to obtain a global view of GPCR‐mediated signaling and identify novel components of their pathways.
Abstract: G-protein-coupled receptors (GPCRs) are the largest family of integral membrane receptors with key roles in regulating signaling pathways targeted by therapeutics, but are difficult to study using existing proteomics technologies due to their complex biochemical features. To obtain a global view of GPCR-mediated signaling and to identify novel components of their pathways, we used a modified membrane yeast two-hybrid (MYTH) approach and identified interacting partners for 48 selected full-length human ligand-unoccupied GPCRs in their native membrane environment. The resulting GPCR interactome connects 686 proteins by 987 unique interactions, including 299 membrane proteins involved in a diverse range of cellular functions. To demonstrate the biological relevance of the GPCR interactome, we validated novel interactions of the GPR37, serotonin 5-HT4d, and adenosine ADORA2A receptors. Our data represent the first large-scale interactome mapping for human GPCRs and provide a valuable resource for the analysis of signaling pathways involving this druggable family of integral membrane proteins.

Journal ArticleDOI
TL;DR: This study investigated a complex, highly interconnected network of 20 Arabidopsis transcription factors at the basis of leaf growth inhibition upon mild osmotic stress, resulting in the identification of a core network, composed of ERF6, ERF8, ERF9, ERF59, and ERF98, which is responsible for most transcriptional connections.
Abstract: Plants have established different mechanisms to cope with environmental fluctuations and accordingly fine-tune their growth and development through the regulation of complex molecular networks. It is largely unknown how the network architectures change and what the key regulators in stress responses and plant growth are. Here, we investigated a complex, highly interconnected network of 20 Arabidopsis transcription factors (TFs) at the basis of leaf growth inhibition upon mild osmotic stress. We tracked the dynamic behavior of the stress-responsive TFs over time, showing the rapid induction following stress treatment, specifically in growing leaves. The connections between the TFs were uncovered using inducible overexpression lines and were validated with transient expression assays. This study resulted in the identification of a core network, composed of ERF6, ERF8, ERF9, ERF59, and ERF98, which is responsible for most transcriptional connections. The analyses highlight the biological function of this core network in environmental adaptation and its redundancy. Finally, a phenotypic analysis of loss-of-function and gain-of-function lines of the transcription factors established multiple connections between the stress-responsive network and leaf growth.

Journal ArticleDOI
TL;DR: It is demonstrated that metabolic stress acts as a selective pressure underlying the recurrent CNAs observed in human tumors, and further cast genomic instability as an enabling event in tumorigenesis and metabolic evolution.
Abstract: Copy number alteration (CNA) profiling of human tumors has revealed recurrent patterns of DNA amplifications and deletions across diverse cancer types. These patterns are suggestive of conserved selection pressures during tumor evolution but cannot be fully explained by known oncogenes and tumor suppressor genes. Using a pan-cancer analysis of CNA data from patient tumors and experimental systems, here we show that principal component analysis-defined CNA signatures are predictive of glycolytic phenotypes, including 18F-fluorodeoxy-glucose (FDG) avidity of patient tumors, and increased proliferation. The primary CNA signature is enriched for p53 mutations and is associated with glycolysis through coordinate amplification of glycolytic genes and other cancer-linked metabolic enzymes. A pan-cancer and cross-species comparison of CNAs highlighted 26 consistently altered DNA regions, containing 11 enzymes in the glycolysis pathway in addition to known cancer-driving genes. Furthermore, exogenous expression of hexokinase and enolase enzymes in an experimental immortalization system altered the subsequent copy number status of the corresponding endogenous loci, supporting the hypothesis that these metabolic genes act as drivers within the conserved CNA amplification regions. Taken together, these results demonstrate that metabolic stress acts as a selective pressure underlying the recurrent CNAs observed in human tumors, and further cast genomic instability as an enabling event in tumorigenesis and metabolic evolution.