Showing papers by "Cold Spring Harbor Laboratory published in 2018"


Journal ArticleDOI
17 Apr 2018-Immunity
TL;DR: An extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA identifies six immune subtypes that encompass multiple cancer types and are hypothesized to define immune response patterns impacting prognosis.

3,246 citations


Journal ArticleDOI
TL;DR: By parsing the unique classes and subclasses of tumor immune microenvironment (TIME) that exist within a patient’s tumor, the ability to predict and guide immunotherapeutic responsiveness will improve, and new therapeutic targets will be revealed.
Abstract: The clinical successes in immunotherapy have been both astounding and at the same time unsatisfactory. Countless patients with varied tumor types have seen pronounced clinical response with immunotherapeutic intervention; however, many more patients have experienced minimal or no clinical benefit when provided the same treatment. As technology has advanced, so has the understanding of the complexity and diversity of the immune context of the tumor microenvironment and its influence on response to therapy. It has been possible to identify different subclasses of immune environment that have an influence on tumor initiation and response to therapy; by parsing the unique classes and subclasses of tumor immune microenvironment (TIME) that exist within a patient's tumor, the ability to predict and guide immunotherapeutic responsiveness will improve, and new therapeutic targets will be revealed.

2,920 citations


Journal ArticleDOI
22 Jun 2018-Science
TL;DR: It is demonstrated that, in the general population, the personality trait neuroticism is significantly correlated with almost every psychiatric disorder and migraine, and it is shown that both psychiatric and neurological disorders have robust correlations with cognitive and personality measures.
Abstract: Disorders of the brain can exhibit considerable epidemiological comorbidity and often share symptoms, provoking debate about their etiologic overlap. We quantified the genetic sharing of 25 brain disorders from genome-wide association studies of 265,218 patients and 784,643 control participants and assessed their relationship to 17 phenotypes from 1,191,588 individuals. Psychiatric disorders share common variant risk, whereas neurological disorders appear more distinct from one another and from the psychiatric disorders. We also identified significant sharing between disorders and a number of brain phenotypes, including cognitive measures. Further, we conducted simulations to explore how statistical power, diagnostic misclassification, and phenotypic heterogeneity affect genetic correlations. These results highlight the importance of common genetic variation as a risk factor for brain disorders and the value of heritability-based methods in understanding their etiology.

1,357 citations



Journal ArticleDOI
TL;DR: NGMLR and Sniffles perform highly accurate alignment and structural variation detection from long-read sequencing data and can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.
Abstract: Structural variations are the greatest source of genetic variation, but they remain poorly understood because of technological limitations. Single-molecule long-read sequencing has the potential to dramatically advance the field, although high error rates are a challenge with existing methods. Addressing this need, we introduce open-source methods for long-read alignment (NGMLR; https://github.com/philres/ngmlr ) and structural variant identification (Sniffles; https://github.com/fritzsedlazeck/Sniffles ) that provide unprecedented sensitivity and precision for variant detection, even in repeat-rich regions and for complex nested events that can have substantial effects on human health. In several long-read datasets, including healthy and cancerous human genomes, we discovered thousands of novel variants and categorized systematic errors in short-read approaches. NGMLR and Sniffles can automatically filter false events and operate on low-coverage data, thereby reducing the high costs that have hindered the application of long reads in clinical and research settings.

1,058 citations


Journal ArticleDOI
28 Sep 2018-Science
TL;DR: The data implicate NETs and NET-mediated ECM remodeling as critical mediators of inflammation-induced awakening in mouse models of dormancy and propose that NETs awaken cancer by concentrating neutrophil proteases at the ECM protein laminin.
Abstract: INTRODUCTION: Most cancer patients die from cancer that recurs after spreading to a different tissue, rather than from their original tumor. After successful treatment of the original tumor, cancer cells that have disseminated to other sites can undergo dormancy, remaining viable but not proliferating. In breast, prostate, and other cancers, cancer cells can remain dormant and clinically undetectable for years and even decades before recurring, or awakening, as metastatic cancer. Little is known about what might initiate cancer awakening, and this in turn reduces our opportunities to prevent metastasis. RATIONALE: Epidemiological studies have suggested that inflammation is linked to a higher risk of breast cancer recurrence after a period of clinical dormancy. Smoking, which causes chronic lung inflammation, is also associated with a higher risk of recurrence. However, whether inflammation can cause awakening is not clear. Inflammatory cells, such as neutrophils, can provide many different signals that promote cancer progression. Neutrophils can kill harmful microorganisms by the release of neutrophil extracellular traps (NETs) into the extracellular space. NETs are scaffolds of DNA with associated cytotoxic proteins and proteases [e.g., neutrophil elastase (NE) and matrix metalloproteinase 9 (MMP9)]. NETs induced by bacteria or by cancer cells can promote metastasis, but the mechanism by which this occurs is not known. In this study, we tested whether NETs formed during lung inflammation could induce awakening. RESULTS: We found that sustained experimental lung inflammation, induced by either tobacco smoke exposure or nasal instillation of lipopolysaccharide (LPS), converted dormant cancer cells to aggressive lung metastases in mice. Both types of sustained inflammation also caused the formation of NETs. Inhibiting NET formation or digesting the NETs’ DNA scaffold prevented conversion of single disseminated cancer cells to growing metastases in mouse models of breast and prostate cancer. The NET DNA bound to the extracellular matrix (ECM) protein laminin, thus bringing two NET-associated proteases, NE and MMP9, to their substrate. This in turn facilitated a sequential cleavage of laminin, first by NE and then by MMP9. The NET-mediated proteolytic remodeling of laminin revealed an epitope that triggered proliferation of dormant cancer cells through integrin activation and FAK/ERK/MLCK/YAP signaling. We generated a blocking antibody against NET-remodeled laminin, and this antibody prevented or reduced tobacco smoke exposure– or LPS-induced inflammation from awakening dormant cancer cells in mice. CONCLUSION: Our data implicate NETs and NET-mediated ECM remodeling as critical mediators of inflammation-induced awakening in mouse models of dormancy. We propose that NETs awaken cancer by concentrating neutrophil proteases at the ECM protein laminin, allowing for sequential proteolytic remodeling of laminin and leading to integrin-mediated signaling in the cancer cells. Our findings set the stage for epidemiological studies to test possible links among inflammation or smoking, NETs, and recurrence after dormancy in human patients. If such links can be established, we envision that approaches similar to the ones used in mouse models in our study could be used to target NETs and their downstream effectors to reduce the risk of cancer recurrence in human patients.

779 citations


Journal ArticleDOI
Adam P. Arkin1, Adam P. Arkin2, Robert W. Cottingham3, Christopher S. Henry4, Nomi L. Harris2, Rick Stevens5, Sergei Maslov6, Paramvir S. Dehal2, Doreen Ware7, Fernando Perez, Shane Canon2, Michael W. Sneddon2, Matthew L. Henderson2, William J. Riehl2, Dan Murphy-Olson4, Stephen Y. Chan2, Roy T. Kamimura2, Sunita Kumari7, Meghan M Drake3, Thomas Brettin4, Elizabeth M. Glass4, Dylan Chivian2, Dan Gunter2, David J. Weston3, Benjamin H. Allen3, Jason K. Baumohl2, Aaron A. Best8, Benjamin P. Bowen2, Steven E. Brenner1, Christopher Bun4, John-Marc Chandonia2, Jer Ming Chia7, R. L. Colasanti4, Neal Conrad4, James J. Davis4, Brian H. Davison3, Matthew DeJongh8, Scott Devoid4, Emily M. Dietrich4, Inna Dubchak2, Janaka N. Edirisinghe4, Janaka N. Edirisinghe5, Gang Fang9, José P. Faria4, Paul M. Frybarger4, Wolfgang Gerlach4, Mark Gerstein9, Annette Greiner2, James Gurtowski7, Holly L. Haun3, Fei He6, Rashmi Jain2, Rashmi Jain10, Marcin P. Joachimiak2, Kevin P. Keegan4, Shinnosuke Kondo8, Vivek Kumar7, Miriam Land3, Folker Meyer4, Mark Mills3, Pavel S. Novichkov2, Taeyun Oh10, Taeyun Oh2, Gary J. Olsen11, Robert Olson4, Bruce Parrello4, Shiran Pasternak7, Erik Pearson2, Sarah S. Poon2, Gavin Price2, Srividya Ramakrishnan7, Priya Ranjan3, Priya Ranjan12, Pamela C. Ronald10, Pamela C. Ronald2, Michael C. Schatz7, Samuel M. D. Seaver4, Maulik Shukla4, Roman A. Sutormin2, Mustafa H Syed3, James Thomason7, Nathan L. Tintle8, Daifeng Wang9, Fangfang Xia4, Hyunseung Yoo4, Shinjae Yoo6, Dantong Yu6 
Abstract: Author(s): Arkin, Adam P; Cottingham, Robert W; Henry, Christopher S; Harris, Nomi L; Stevens, Rick L; Maslov, Sergei; Dehal, Paramvir; Ware, Doreen; Perez, Fernando; Canon, Shane; Sneddon, Michael W; Henderson, Matthew L; Riehl, William J; Murphy-Olson, Dan; Chan, Stephen Y; Kamimura, Roy T; Kumari, Sunita; Drake, Meghan M; Brettin, Thomas S; Glass, Elizabeth M; Chivian, Dylan; Gunter, Dan; Weston, David J; Allen, Benjamin H; Baumohl, Jason; Best, Aaron A; Bowen, Ben; Brenner, Steven E; Bun, Christopher C; Chandonia, John-Marc; Chia, Jer-Ming; Colasanti, Ric; Conrad, Neal; Davis, James J; Davison, Brian H; DeJongh, Matthew; Devoid, Scott; Dietrich, Emily; Dubchak, Inna; Edirisinghe, Janaka N; Fang, Gang; Faria, Jose P; Frybarger, Paul M; Gerlach, Wolfgang; Gerstein, Mark; Greiner, Annette; Gurtowski, James; Haun, Holly L; He, Fei; Jain, Rashmi; Joachimiak, Marcin P; Keegan, Kevin P; Kondo, Shinnosuke; Kumar, Vivek; Land, Miriam L; Meyer, Folker; Mills, Marissa; Novichkov, Pavel S; Oh, Taeyun; Olsen, Gary J; Olson, Robert; Parrello, Bruce; Pasternak, Shiran; Pearson, Erik; Poon, Sarah S; Price, Gavin A; Ramakrishnan, Srividya; Ranjan, Priya; Ronald, Pamela C; Schatz, Michael C; Seaver, Samuel MD; Shukla, Maulik; Sutormin, Roman A; Syed, Mustafa H; Thomason, James; Tintle, Nathan L; Wang, Daifeng; Xia, Fangfang; Yoo, Hyunseung; Yoo, Shinjae; Yu, Dantong

743 citations


Journal ArticleDOI
TL;DR: The fundamental properties of TEs and their complex interactions with their cellular environment are introduced, which are crucial to understanding their impact and manifold consequences for organismal biology.
Abstract: Transposable elements (TEs) are major components of eukaryotic genomes. However, the extent of their impact on genome evolution, function, and disease remains a matter of intense interrogation. The rise of genomics and large-scale functional assays has shed new light on the multi-faceted activities of TEs and implies that they should no longer be marginalized. Here, we introduce the fundamental properties of TEs and their complex interactions with their cellular environment, which are crucial to understanding their impact and manifold consequences for organismal biology. While we draw examples primarily from mammalian systems, the core concepts outlined here are relevant to a broad range of organisms.

691 citations


Journal ArticleDOI
TL;DR: A pancreatic cancer patient-derived organoid (PDO) library is generated that recapitulates the mutational spectrum and transcriptional subtypes of primary pancreatic cancer, and it is proposed that combined molecular and therapeutic profiling of PDOs may predict clinical response and enable prospective therapeutic selection.
Abstract: Pancreatic cancer is the most lethal common solid malignancy. Systemic therapies are often ineffective and predictive biomarkers to guide treatment are urgently needed. We generated a pancreatic cancer patient-derived organoid (PDO) library that recapitulates the mutational spectrum and transcriptional subtypes of primary pancreatic cancer. New driver oncogenes were nominated and transcriptomic analyses revealed unique clusters. PDOs exhibited heterogeneous responses to standard-of-care chemotherapeutics and investigational agents. In a case study manner, we find that PDO therapeutic profiles paralleled patient outcomes and that PDOs enable longitudinal assessment of chemo-sensitivity and evaluation of synchronous metastases. We derived organoid-based gene expression signatures of chemo-sensitivity that predicted improved responses for many patients to chemotherapy in both the adjuvant and advanced disease settings. Finally, we nominated alternative treatment strategies for chemo-refractory PDOs using targeted agent therapeutic profiling. We propose that combined molecular and therapeutic profiling of PDOs may predict clinical response and enable prospective therapeutic selection.

608 citations


Journal ArticleDOI
TL;DR: LFADS, a deep learning method for analyzing neural population activity, can extract neural dynamics from single-trial recordings, stitch separate datasets into a single model, and infer perturbations to these dynamics that correlate with, for example, behavioral choices.
Abstract: Neuroscience is experiencing a revolution in which simultaneous recording of thousands of neurons is revealing population dynamics that are not apparent from single-neuron responses. This structure is typically extracted from data averaged across many trials, but deeper understanding requires studying phenomena detected in single trials, which is challenging due to incomplete sampling of the neural population, trial-to-trial variability, and fluctuations in action potential timing. We introduce latent factor analysis via dynamical systems, a deep learning method to infer latent dynamics from single-trial neural spiking data. When applied to a variety of macaque and human motor cortical datasets, latent factor analysis via dynamical systems accurately predicts observed behavioral variables, extracts precise firing rate estimates of neural dynamics on single trials, infers perturbations to those dynamics that correlate with behavioral choices, and combines data from non-overlapping recording sessions spanning months to improve inference of underlying dynamics.

455 citations


Journal ArticleDOI
TL;DR: Examples of lncRNAs that demonstrate the diversity of their function in various cancer types are discussed, along with recent advances in nucleic acid drug development, with a focus on oligonucleotide-based therapies as a novel approach to inhibit tumor progression.

Journal ArticleDOI
TL;DR: This paper provides an update to the previous publications about the Ensembl Genomes resource, with a focus on recent developments and expansions, including the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data.
Abstract: Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly-available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these using automated, scalable methods.

Journal ArticleDOI
TL;DR: This Review discusses bioinformatics tools that have been devised to handle the numerous characteristic features of these long-range data types, with applications in genome assembly, genetic variant detection, haplotype phasing, transcriptomics and epigenomics.
Abstract: Several new genomics technologies have become available that offer long-read sequencing or long-range mapping with higher throughput and higher resolution analysis than ever before. These long-range technologies are rapidly advancing the field with improved reference genomes, more comprehensive variant identification and more complete views of transcriptomes and epigenomes. However, they also require new bioinformatics approaches to take full advantage of their unique characteristics while overcoming their complex errors and modalities. Here, we discuss several of the most important applications of the new technologies, focusing on both the currently available bioinformatics tools and opportunities for future research.

Journal ArticleDOI
TL;DR: This study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young ‘AA’ subclade containing the two domesticated species and identifying many new haplotypes of potential use for future crop protection.
Abstract: The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that despite few large-scale chromosomal rearrangements rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young 'AA' subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 'Miracle Rice', which relieved famine and drove the Green Revolution in Asia 50 years ago.


Journal ArticleDOI
Author(s): Bukowski, Robert; Guo, Xiaosen; Lu, Yanli; Zou, Cheng; He, Bing; Rong, Zhengqin; Wang, Bo; Xu, Dawen; Yang, Bicheng; Xie, Chuanxiao; Fan, Longjiang; Gao, Shibin; Xu, Xun; Zhang, Gengyun; Li, Yingrui; Jiao, Yinping; Doebley, John F; Ross-Ibarra, Jeffrey; Lorant, Anne; Buffalo, Vince; Romay, M Cinta; Buckler, Edward S; Ware, Doreen; Lai, Jinsheng; Sun, Qi; Xu, Yunbi
TL;DR: The maize haplotype version 3 (HapMap 3) was built from whole-genome sequencing data from 1218 maize lines, covering predomestication and domesticated Zea mays varieties across the world.
Abstract: Background: Characterization of genetic variations in maize has been challenging, mainly due to deterioration of collinearity between individual genomes in the species. An international consortium of maize research groups combined resources to develop the maize haplotype version 3 (HapMap 3), built from whole-genome sequencing data from 1218 maize lines, covering predomestication and domesticated Zea mays varieties across the world. Results: A new computational pipeline was set up to process more than 12 trillion bp of sequencing data, and a set of population genetics filters was applied to identify more than 83 million variant sites. Conclusions: We identified polymorphisms in regions where collinearity is largely preserved in the maize species. However, the fact that the B73 genome used as the reference only represents a fraction of all haplotypes is still an important limiting factor.

Journal ArticleDOI
TL;DR: The relationship between NMDA receptor structure and function is reviewed with an emphasis on emerging atomic resolution structures, which begin to explain unique features of this receptor.
Abstract: NMDA-type glutamate receptors are ligand-gated ion channels that mediate a Ca2+-permeable component of excitatory neurotransmission in the central nervous system (CNS). They are expressed throughout the CNS and play key physiological roles in synaptic function, such as synaptic plasticity, learning, and memory. NMDA receptors are also implicated in the pathophysiology of several CNS disorders and more recently have been identified as a locus for disease-associated genomic variation. NMDA receptors exist as a diverse array of subtypes formed by variation in assembly of seven subunits (GluN1, GluN2A-D, and GluN3A-B) into tetrameric receptor complexes. These NMDA receptor subtypes show unique structural features that account for their distinct functional and pharmacological properties allowing precise tuning of their physiological roles. Here, we review the relationship between NMDA receptor structure and function with an emphasis on emerging atomic resolution structures, which begin to explain unique features of this receptor.

Journal ArticleDOI
TL;DR: A study developed genomic resources and efficient transformation in the orphan crop groundcherry, and managed to improve productivity traits by editing the orthologues of tomato domestication and improvement genes using CRISPR–Cas9.
Abstract: Genome editing holds great promise for increasing crop productivity, and there is particular interest in advancing breeding in orphan crops, which are often burdened by undesirable characteristics resembling wild relatives. We developed genomic resources and efficient transformation in the orphan Solanaceae crop 'groundcherry' (Physalis pruinosa) and used clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein-9 nuclease (Cas9) (CRISPR-Cas9) to mutate orthologues of tomato domestication and improvement genes that control plant architecture, flower production and fruit size, thereby improving these major productivity traits. Thus, translating knowledge from model crops enables rapid creation of targeted allelic diversity and novel breeding germplasm in distantly related orphan crops.

Journal ArticleDOI
TL;DR: The de novo genome assembly of maize line Mo17 and comparative analysis with other sequenced maize lines show extensive gene-order variations, which should have implications for heterosis and genome evolution.
Abstract: Maize is an important crop with a high level of genome diversity and heterosis. The genome sequence of a typical female line, B73, was previously released. Here, we report a de novo genome assembly of a corresponding male representative line, Mo17. More than 96.4% of the 2,183 Mb assembled genome can be accounted for by 362 scaffolds in ten pseudochromosomes with 38,620 annotated protein-coding genes. Comparative analysis revealed large gene-order and gene structural variations: approximately 10% of the annotated genes were mutually nonsyntenic, and more than 20% of the predicted genes had either large-effect mutations or large structural variations, which might cause considerable protein divergence between the two inbred lines. Our study provides a high-quality reference-genome sequence of an important maize germplasm, and the intraspecific gene order and gene structural variations identified should have implications for heterosis and genome evolution. The de novo genome assembly of maize line Mo17 and comparative analysis with other sequenced maize lines show extensive gene-order variations. This study provides insights into maize evolution and has implications for improving maize hybrid lines.

Journal ArticleDOI
TL;DR: Among patients with progressive, refractory, or symptomatic desmoid tumors, sorafenib significantly prolonged progression‐free survival and induced durable responses.
Abstract: Background Desmoid tumors (also referred to as aggressive fibromatosis) are connective tissue neoplasms that can arise in any anatomical location and infiltrate the mesentery, neurovascula...

Journal ArticleDOI
TL;DR: This work reengineered the sequences of BE3, BE4Gam, and xBE3 by codon optimization and incorporation of additional nuclear-localization sequences and shows that the optimized base editors mediate efficient in vivo somatic editing in the liver in adult mice.
Abstract: CRISPR base editing enables the creation of targeted single-base conversions without generating double-stranded breaks. However, the efficiency of current base editors is very low in many cell types. We reengineered the sequences of BE3, BE4Gam, and xBE3 by codon optimization and incorporation of additional nuclear-localization sequences. Our collection of optimized constitutive and inducible base-editing vector systems dramatically improves the efficiency by which single-nucleotide variants can be created. The reengineered base editors enable target modification in a wide range of mouse and human cell lines, and intestinal organoids. We also show that the optimized base editors mediate efficient in vivo somatic editing in the liver in adult mice.

Journal ArticleDOI
Maria Dornelas1, Laura H. Antão2, Laura H. Antão1, Faye Moyes1, +283 more; Institutions (130)
TL;DR: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time, enabling users to calculate temporal trends in biodiversity within and amongst assemblages using a broad range of metrics.
Abstract: Motivation: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time. These data enable users to calculate temporal trends in biodiversity within and amongst assemblages using a broad range of metrics. BioTIME is being developed as a community-led open-source database of biodiversity time series. Our goal is to accelerate and facilitate quantitative analysis of temporal patterns of biodiversity in the Anthropocene. Main types of variables included: The database contains 8,777,413 species abundance records, from assemblages consistently sampled for a minimum of 2 years, which need not necessarily be consecutive. In addition, the database contains metadata relating to sampling methodology and contextual information about each record. Spatial location and grain: BioTIME is a global database of 547,161 unique sampling locations spanning the marine, freshwater and terrestrial realms. Grain size varies across datasets from 0.0000000158 km² (158 cm²) to 100 km² (1,000,000,000,000 cm²). Time period and grain: BioTIME records span from 1874 to 2016. The minimal temporal grain across all datasets in BioTIME is a year. Major taxa and level of measurement: BioTIME includes data from 44,440 species across the plant and animal kingdoms, ranging from plants, plankton and terrestrial invertebrates to small and large vertebrates.
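The grain sizes quoted in the BioTIME abstract are given in both square kilometres and square centimetres. As a quick sanity check, the conversion can be sketched as follows; the function name `km2_to_cm2` is an illustrative assumption and is not part of BioTIME.

```python
# Illustrative unit check for the grain sizes quoted in the abstract.
# 1 km = 100,000 cm, so 1 km^2 = 100,000^2 = 10^10 cm^2.
CM2_PER_KM2 = 100_000 ** 2


def km2_to_cm2(area_km2):
    """Convert an area from square kilometres to square centimetres."""
    return area_km2 * CM2_PER_KM2


smallest_grain_cm2 = km2_to_cm2(0.0000000158)  # smallest grain in the abstract
largest_grain_cm2 = km2_to_cm2(100)            # largest grain in the abstract

# Round the small value because it passes through floating point.
print(round(smallest_grain_cm2), largest_grain_cm2)
```

Both results agree with the parenthesised values in the abstract (158 cm² and 1,000,000,000,000 cm²).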

Journal ArticleDOI
TL;DR: Applying domain-focused CRISPR screening to human cancer cell lines revealed POU2F3 as a cell identity determinant and a dependency in a tuft cell-like variant of SCLC, which may reflect a previously unrecognized cell of origin or a trans-differentiation event in this disease.
Abstract: Small cell lung cancer (SCLC) is widely considered to be a tumor of pulmonary neuroendocrine cells; however, a variant form of this disease has been described that lacks neuroendocrine features. Here, we applied domain-focused CRISPR screening to human cancer cell lines to identify the transcription factor (TF) POU2F3 (POU class 2 homeobox 3; also known as SKN-1a/OCT-11) as a powerful dependency in a subset of SCLC lines. An analysis of human SCLC specimens revealed that POU2F3 is expressed exclusively in variant SCLC tumors that lack expression of neuroendocrine markers and instead express markers of a chemosensory lineage known as tuft cells. Using chromatin- and RNA-profiling experiments, we provide evidence that POU2F3 is a master regulator of tuft cell identity in a variant form of SCLC. Moreover, we show that most SCLC tumors can be classified into one of three lineages based on the expression of POU2F3, ASCL1, or NEUROD1. Our CRISPR screens exposed other unique dependencies in POU2F3-expressing SCLC lines, including the lineage TFs SOX9 and ASCL2 and the receptor tyrosine kinase IGF1R (insulin-like growth factor 1 receptor). These data reveal POU2F3 as a cell identity determinant and a dependency in a tuft cell-like variant of SCLC, which may reflect a previously unrecognized cell of origin or a trans-differentiation event in this disease.

Journal ArticleDOI
05 Apr 2018-Nature
TL;DR: The results indicate that the dominant mode of intracortical information transfer is not based on ‘one neuron–one target area’ mapping, and instead, signals carried by individual cortical neurons are shared across subsets of target areas, and thus concurrently contribute to multiple functional pathways.
Abstract: Neocortical areas communicate through extensive axonal projections, but the logic of information transfer remains poorly understood, because the projections of individual neurons have not been systematically characterized. It is not known whether individual neurons send projections only to single cortical areas or distribute signals across multiple targets. Here we determine the projection patterns of 591 individual neurons in the mouse primary visual cortex using whole-brain fluorescence-based axonal tracing and high-throughput DNA sequencing of genetically barcoded neurons (MAPseq). Projections were highly diverse and divergent, collectively targeting at least 18 cortical and subcortical areas. Most neurons targeted multiple cortical areas, often in non-random combinations, suggesting that sub-classes of intracortical projection neurons exist. Our results indicate that the dominant mode of intracortical information transfer is not based on 'one neuron-one target area' mapping. Instead, signals carried by individual cortical neurons are shared across subsets of target areas, and thus concurrently contribute to multiple functional pathways.

Journal ArticleDOI
Andrew J. Aguirre, Jonathan A. Nowak1, Jonathan A. Nowak2, Nicholas D. Camarda, Richard A. Moffitt3, Arezou A. Ghazani4, Arezou A. Ghazani2, Arezou A. Ghazani1, Mehlika Hazar-Rethinam2, Srivatsan Raghavan, Jaegil Kim4, Lauren K. Brais2, Dorisanne Y. Ragon2, Marisa W. Welch2, Emma Reilly2, Devin McCabe, Lori Marini1, Lori Marini2, Kristin Anderka4, Karla Helvie2, Karla Helvie1, Nelly Oliver1, Nelly Oliver2, Ana Babic2, Annacarolina da Silva2, Annacarolina da Silva1, Brandon Nadres2, Emily E. Van Seventer2, Heather A. Shahzade2, Joseph P. St. Pierre2, Kelly P. Burke2, Kelly P. Burke1, Thomas E. Clancy1, Thomas E. Clancy2, James M. Cleary2, James M. Cleary1, Leona A. Doyle1, Leona A. Doyle2, Kunal Jajoo1, Kunal Jajoo2, Nadine Jackson McCleary2, Nadine Jackson McCleary1, Jeffrey A. Meyerhardt1, Jeffrey A. Meyerhardt2, Janet E. Murphy2, Kimmie Ng2, Kimmie Ng1, Anuj K. Patel1, Anuj K. Patel2, Kimberly Perez2, Kimberly Perez1, Michael H. Rosenthal1, Michael H. Rosenthal2, Douglas A. Rubinson2, Douglas A. Rubinson1, Marvin Ryou2, Marvin Ryou1, Geoffrey I. Shapiro1, Geoffrey I. Shapiro2, Ewa Sicinska2, Stuart G. Silverman1, Stuart G. Silverman2, Rebecca J. Nagy, Richard B. Lanman, Deborah Knoerzer, Dean Welsch, Matthew B. Yurgelun2, Matthew B. Yurgelun1, Charles S. Fuchs, Levi A. Garraway, Gad Getz4, Gad Getz2, Jason L. Hornick1, Jason L. Hornick2, Bruce E. Johnson, Matthew H. Kulke1, Matthew H. Kulke2, Robert J. Mayer2, Robert J. Mayer1, Jeffrey W. Miller2, Paul B. Shyn1, Paul B. Shyn2, David A. Tuveson5, Nikhil Wagle, Jen Jen Yeh6, William C. Hahn, Ryan B. Corcoran2, Scott L. Carter, Brian M. Wolpin1, Brian M. Wolpin2 
TL;DR: Using an integrated multidisciplinary biopsy program, it is demonstrated that real-time genomic characterization of advanced PDAC can identify clinically relevant alterations that inform management of this difficult disease.
Abstract: Clinically relevant subtypes exist for pancreatic ductal adenocarcinoma (PDAC), but molecular characterization is not yet standard in clinical care. We implemented a biopsy protocol to perform time-sensitive whole exome sequencing and RNA-sequencing for patients with advanced PDAC. Therapeutically relevant genomic alterations were identified in 48% (34/71) and pathogenic/likely pathogenic germline alterations in 18% (13/71) of patients. Overall, 30% (21/71) of enrolled patients experienced a change in clinical management as a result of genomic data. Twenty-six patients had germline and/or somatic alterations in DNA-damage repair genes, and 5 additional patients had mutational signatures of homologous recombination deficiency but no identified causal genomic alteration. Two patients had oncogenic in-frame BRAF deletions, and we report the first clinical evidence that this alteration confers sensitivity to MAP-kinase pathway inhibition. Moreover, we identified tumor/stroma gene expression signatures with clinical relevance. Collectively, these data demonstrate the feasibility and value of real-time genomic characterization of advanced PDAC.

Journal ArticleDOI
TL;DR: This work develops MetaNeighbor for measuring cell-type replication across datasets, and uses it to identify marker genes for neuron subtypes with evidence of replication, and finds that large sets of variably expressed genes can identify replicable cell types with high accuracy.
Abstract: Single-cell RNA-sequencing (scRNA-seq) technology provides a new avenue to discover and characterize cell types; however, the experiment-specific technical biases and analytic variability inherent to current pipelines may undermine its replicability. Meta-analysis is further hampered by the use of ad hoc naming conventions. Here we demonstrate our replication framework, MetaNeighbor, that quantifies the degree to which cell types replicate across datasets, and enables rapid identification of clusters with high similarity. We first measure the replicability of neuronal identity, comparing results across eight technically and biologically diverse datasets to define best practices for more complex assessments. We then apply this to novel interneuron subtypes, finding that 24/45 subtypes have evidence of replication, which enables the identification of robust candidate marker genes. Across tasks we find that large sets of variably expressed genes can identify replicable cell types with high accuracy, suggesting a general route forward for large-scale evaluation of scRNA-seq data.

Journal ArticleDOI
24 Oct 2018-Nature
TL;DR: The yeast one-hybrid network for nitrogen-associated metabolism in Arabidopsis reveals the transcription factors that regulate the architecture of root and shoot systems under conditions of changing nitrogen availability.
Abstract: Nitrogen is an essential macronutrient for plant growth and basic metabolic processes. The application of nitrogen-containing fertilizer increases yield, which has been a substantial factor in the green revolution¹. Ecologically, however, excessive application of fertilizer has disastrous effects such as eutrophication². A better understanding of how plants regulate nitrogen metabolism is critical to increase plant yield and reduce fertilizer overuse. Here we present a transcriptional regulatory network and twenty-one transcription factors that regulate the architecture of root and shoot systems in response to changes in nitrogen availability. Genetic perturbation of a subset of these transcription factors revealed coordinate transcriptional regulation of enzymes involved in nitrogen metabolism. Transcriptional regulators in the network are transcriptionally modified by feedback via genetic perturbation of nitrogen metabolism. The network, genes and gene-regulatory modules identified here will prove critical to increasing agricultural productivity.

Journal ArticleDOI
TL;DR: It is shown that significant structural heterogeneity exists in comparison to the B73 reference genome at multiple scales, from transposon composition and copy number variation to single-nucleotide polymorphisms.
Abstract: The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using short-read sequencing technologies. We show that significant structural heterogeneity exists in comparison to the B73 reference genome at multiple scales, from transposon composition and copy number variation to single-nucleotide polymorphisms. The generation of this reference genome enables accurate placement of thousands of Mutator (Mu) and Dissociation (Ds) transposable element insertions for reverse and forward genetics studies. Annotation of the genome has been achieved using RNA-seq analysis, differential nuclease sensitivity profiling and bisulfite sequencing to map open reading frames, open chromatin sites and DNA methylation profiles, respectively. Collectively, the resources developed here integrate W22 as a community reference genome for functional genomics and provide a foundation for the maize pan-genome.

Journal ArticleDOI
TL;DR: Analysis of GTEx, cancer and autism data sets shows that cis-regulatory variation can modify the penetrance of coding variants, demonstrating that joint regulatory and coding variant effects are an important part of the genetic architecture of human traits and contribute to modified penetrance of disease-causing variants.
Abstract: Coding variants represent many of the strongest associations between genotype and phenotype; however, they exhibit inter-individual differences in effect, termed 'variable penetrance'. Here, we study how cis-regulatory variation modifies the penetrance of coding variants. Using functional genomic and genetic data from the Genotype-Tissue Expression Project (GTEx), we observed that in the general population, purifying selection has depleted haplotype combinations predicted to increase pathogenic coding variant penetrance. Conversely, in cancer and autism patients, we observed an enrichment of penetrance increasing haplotype configurations for pathogenic variants in disease-implicated genes, providing evidence that regulatory haplotype configuration of coding variants affects disease risk. Finally, we experimentally validated this model by editing a Mendelian single-nucleotide polymorphism (SNP) using CRISPR/Cas9 on distinct expression haplotypes with the transcriptome as a phenotypic readout. Our results demonstrate that joint regulatory and coding variant effects are an important part of the genetic architecture of human traits and contribute to modified penetrance of disease-causing variants.

Journal ArticleDOI
TL;DR: To infer the shape of such global epistasis in three proteins, based on published high-throughput mutagenesis data, a maximum-likelihood inference procedure is developed using a flexible family of monotonic nonlinear functions spanned by an I-spline basis.
Abstract: Genotype-phenotype relationships are notoriously complicated. Idiosyncratic interactions between specific combinations of mutations occur and are difficult to predict. Yet it is increasingly clear that many interactions can be understood in terms of global epistasis. That is, mutations may act additively on some underlying, unobserved trait, and this trait is then transformed via a nonlinear function to the observed phenotype as a result of subsequent biophysical and cellular processes. Here we infer the shape of such global epistasis in three proteins, based on published high-throughput mutagenesis data. To do so, we develop a maximum-likelihood inference procedure using a flexible family of monotonic nonlinear functions spanned by an I-spline basis. Our analysis uncovers dramatic nonlinearities in all three proteins; in some proteins a model with global epistasis accounts for virtually all of the measured variation, whereas in others we find substantial local epistasis as well. This method allows us to test hypotheses about the form of global epistasis and to distinguish variance components attributable to global epistasis, local epistasis, and measurement error.
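
The global-epistasis idea in the abstract above (mutations add up on an unobserved trait, which a monotonic nonlinear function then maps to the measured phenotype) can be illustrated with a small sketch. This is not the authors' inference procedure: the logistic curve stands in for their I-spline family, and the effect sizes and function names are hypothetical values chosen only for illustration.

```python
import math


def latent_trait(genotype, effects):
    """Additive latent trait: sum the effects of the mutations present (1) in the genotype."""
    return sum(e for g, e in zip(genotype, effects) if g)


def observed_phenotype(phi):
    """Monotonic nonlinearity from latent trait to phenotype.

    A logistic curve is an assumption made for this sketch; the abstract
    describes a flexible monotonic family spanned by an I-spline basis.
    """
    return 1.0 / (1.0 + math.exp(-phi))


def pairwise_epistasis(values):
    """Deviation of the double mutant from additivity, given values for (wt, a, b, ab)."""
    wt, a, b, ab = values
    return ab - a - b + wt


# Hypothetical additive effects of two mutations on the latent scale.
effects = [-2.0, -2.0]
genotypes = [(0, 0), (1, 0), (0, 1), (1, 1)]  # wild type, single a, single b, double

latent = [latent_trait(g, effects) for g in genotypes]
observed = [observed_phenotype(phi) for phi in latent]

latent_epistasis = pairwise_epistasis(latent)      # zero: effects are additive here
observed_epistasis = pairwise_epistasis(observed)  # nonzero: created by the nonlinearity

print(f"latent-scale epistasis:   {latent_epistasis:.4f}")
print(f"observed-scale epistasis: {observed_epistasis:.4f}")
```

In this toy case the two mutations are exactly additive on the latent scale, yet the double mutant deviates from additivity on the observed scale purely because of the nonlinear mapping. That apparent interaction is what the abstract attributes to global rather than local epistasis.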