Showing papers in "Nature Methods in 2020"
••
University of Jyväskylä1, University of California, Los Angeles2, California Polytechnic State University3, Los Alamos National Laboratory4, National Research University – Higher School of Economics5, University of California, Berkeley6, University of Birmingham7, Australian Nuclear Science and Technology Organisation8, University of Washington9, University of Massachusetts Amherst10, University of West Bohemia11, University of Texas at Austin12, Brigham Young University13, Universidade Federal de Minas Gerais14, Google15
TL;DR: SciPy is an open-source scientific computing library for the Python programming language that has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year.
Abstract: SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. In this work, we provide an overview of the capabilities and development practices of SciPy 1.0 and highlight some recent technical developments.
6,244 citations
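As a minimal illustration of the kind of algorithms the library exposes (this example is not from the paper, just standard SciPy usage), two of its core subpackages cover numerical integration and root finding:

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate exp(-x^2) over the real line; the exact value is sqrt(pi).
value, abs_err = integrate.quad(lambda x: np.exp(-x**2), -np.inf, np.inf)
print(value)  # ≈ 1.7724538509 (sqrt(pi))

# Find the root of cos(x) - x by Brent's method on the bracketing interval [0, 1].
root = optimize.brentq(lambda x: np.cos(x) - x, 0.0, 1.0)
print(root)   # ≈ 0.7390851332
```

Both calls wrap compiled QUADPACK and Brent-method routines, which is representative of SciPy's role as a thin Pythonic layer over battle-tested numerical code.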
••
TL;DR: A long-read assembler, wtdbg2, is developed that is 2–17 times as fast as published tools while achieving comparable contiguity and accuracy; the speed advantage is largest for large genomes.
Abstract: Existing long-read assemblers require thousands of central processing unit hours to assemble a human genome and are being outpaced by sequencing technologies in terms of both throughput and cost. We developed a long-read assembler wtdbg2 (https://github.com/ruanjue/wtdbg2) that is 2–17 times as fast as published tools while achieving comparable contiguity and accuracy. It paves the way for population-scale long-read assembly in the future. Wtdbg2 assembles genomes with comparable contiguity and accuracy to existing tools using long-read sequencing data, and is several times faster, especially for large genomes.
783 citations
••
TL;DR: NicheNet is presented, a method that predicts ligand–target links between interacting cells by combining their expression data with prior knowledge on signaling and gene regulatory networks, and can infer active ligands and their gene regulatory effects on interacting cells.
Abstract: Computational methods that model how gene expression of a cell is influenced by interacting cells are lacking. We present NicheNet (https://github.com/saeyslab/nichenetr), a method that predicts ligand-target links between interacting cells by combining their expression data with prior knowledge on signaling and gene regulatory networks. We applied NicheNet to tumor and immune cell microenvironment data and demonstrate that NicheNet can infer active ligands and their gene regulatory effects on interacting cells.
681 citations
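The core of the approach can be sketched in a few lines: rank candidate ligands by how well a prior-knowledge ligand-to-target regulatory-potential vector predicts which genes actually responded in the receiver cells. This toy sketch is not the nichenetr implementation; the ligand names, matrices and signature are invented for illustration, and Pearson correlation stands in for the ligand-activity measure.

```python
import numpy as np

rng = np.random.default_rng(0)
genes = 200

# Hypothetical prior knowledge: regulatory potential of three ligands over all genes.
prior = {"LigA": rng.random(genes),
         "LigB": rng.random(genes),
         "LigC": rng.random(genes)}

# Simulated observed gene signature in receiver cells (True = gene responded),
# constructed so that LigB's prior is the one that explains it.
signature = (prior["LigB"] + 0.3 * rng.random(genes)) > 1.0

def ligand_activity(potential, signature):
    """Correlation between a ligand's prior potential and the observed signature."""
    return np.corrcoef(potential, signature.astype(float))[0, 1]

# Rank ligands by how well their prior predicts the observed response.
ranking = sorted(prior, key=lambda lig: ligand_activity(prior[lig], signature),
                 reverse=True)
print(ranking)
```

The best-ranked ligand is the one whose known downstream targets overlap most with the genes that changed, which is the intuition behind inferring "active ligands" from expression data plus prior networks.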
••
TL;DR: Non-uniform refinement, an algorithm based on cross-validation optimization, is introduced, which automatically regularizes 3D density maps during refinement to account for spatial variability and yields dramatically improved resolution and 3D map quality in many cases.
Abstract: Cryogenic electron microscopy (cryo-EM) is widely used to study biological macromolecules that comprise regions with disorder, flexibility or partial occupancy. For example, membrane proteins are often kept in solution with detergent micelles and lipid nanodiscs that are locally disordered. Such spatial variability negatively impacts computational three-dimensional (3D) reconstruction with existing iterative refinement algorithms that assume rigidity. We introduce non-uniform refinement, an algorithm based on cross-validation optimization, which automatically regularizes 3D density maps during refinement to account for spatial variability. Unlike common shift-invariant regularizers, non-uniform refinement systematically removes noise from disordered regions, while retaining signal useful for aligning particle images, yielding dramatically improved resolution and 3D map quality in many cases. We obtain high-resolution reconstructions for multiple membrane proteins as small as 100 kDa, demonstrating increased effectiveness of cryo-EM for this class of targets critical in structural biology and drug discovery. Non-uniform refinement is implemented in the cryoSPARC software package. Membrane proteins exhibit spatial variation in rigidity and disorder, which poses a challenge for traditional cryo-EM reconstruction algorithms. Non-uniform refinement accounts for this spatial variability, yielding improved 3D reconstruction quality even for small membrane proteins.
620 citations
••
University of Jyväskylä1, University of California, Los Angeles2, California Polytechnic State University3, Los Alamos National Laboratory4, National Research University – Higher School of Economics5, University of California, Berkeley6, University of Birmingham7, Australian Nuclear Science and Technology Organisation8, University of Washington9, University of Massachusetts Amherst10, University of West Bohemia11, Brigham Young University12, University of Texas at Austin13, Universidade Federal de Minas Gerais14, Google15
TL;DR: An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Abstract: An amendment to this paper has been published and can be accessed via a link at the top of the paper.
617 citations
••
TL;DR: DIA-NN improves the identification and quantification performance in conventional DIA proteomic applications, and is particularly beneficial for high-throughput applications, as it is fast and enables deep and confident proteome coverage when used in combination with fast chromatographic methods.
Abstract: We present an easy-to-use integrated software suite, DIA-NN, that exploits deep neural networks and new quantification and signal correction strategies for the processing of data-independent acquisition (DIA) proteomics experiments. DIA-NN improves the identification and quantification performance in conventional DIA proteomic applications, and is particularly beneficial for high-throughput applications, as it is fast and enables deep and confident proteome coverage when used in combination with fast chromatographic methods.
584 citations
••
University of Montana1, University of California, San Diego2, University of Münster3, University of Jena4, University of Lübeck5, Statens Serum Institut6, University of Tübingen7, University of Geneva8, Bruker9, Paris Descartes University10, University of São Paulo11, Technical University of Berlin12, Georgia Institute of Technology13, Saint Petersburg State University14, Waters Corporation15, Academy of Sciences of the Czech Republic16, Sookmyung Women's University17, University of Grenoble18, University of Oklahoma19, Carnegie Mellon University20, University of West Alabama21, Leibniz Association22, University of Corsica Pascal Paoli23, Massachusetts Institute of Technology24, Michigan State University25, University of Glasgow26, Wageningen University and Research Centre27, Kangwon National University28
TL;DR: Feature-based molecular networking (FBMN) is an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools.
Abstract: Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.
497 citations
••
New York University1, Johns Hopkins University2, University of Washington3, University of North Carolina at Chapel Hill4, Duke University5, Scripps Research Institute6, Hebrew University of Jerusalem7, Ohio State University8, University of California, San Francisco9, Baylor College of Medicine10, École Polytechnique Fédérale de Lausanne11, Vanderbilt University12, Rutgers University13, Swiss Institute of Bioinformatics14, Fred Hutchinson Cancer Research Center15, Rensselaer Polytechnic Institute16, Northeastern University17, Stanford University18, DSM19, Fox Chase Cancer Center20, University of Maryland, College Park21, University of Warsaw22, University of Denver23, Australian National University24, University of Kansas25, University of Zurich26, University of Massachusetts Dartmouth27, University of Tokyo28, Franklin & Marshall College29, Weizmann Institute of Science30, Lund University31, University of California, Santa Cruz32, University of California, Davis33
TL;DR: This Perspective reviews tools developed over the past five years in the Rosetta software, including over 80 methods, and discusses improvements to the score function, user interfaces and usability.
Abstract: The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
430 citations
••
TL;DR: MaSIF (molecular surface interaction fingerprinting) is presented, a conceptual framework based on a geometric deep learning method to capture fingerprints that are important for specific biomolecular interactions that will lead to improvements in the understanding of protein function and design.
Abstract: Predicting interactions between proteins and other biomolecules solely based on structure remains a challenge in biology. A high-level representation of protein structure, the molecular surface, displays patterns of chemical and geometric features that fingerprint a protein's modes of interactions with other biomolecules. We hypothesize that proteins participating in similar interactions may share common fingerprints, independent of their evolutionary history. Fingerprints may be difficult to grasp by visual analysis but could be learned from large-scale datasets. We present MaSIF (molecular surface interaction fingerprinting), a conceptual framework based on a geometric deep learning method to capture fingerprints that are important for specific biomolecular interactions. We showcase MaSIF with three prediction challenges: protein pocket-ligand prediction, protein-protein interaction site prediction and ultrafast scanning of protein surfaces for prediction of protein-protein complexes. We anticipate that our conceptual framework will lead to improvements in our understanding of protein function and design.
389 citations
••
Fred Hutchinson Cancer Research Center1, University of Cambridge2, Genentech3, Brigham and Women's Hospital4, City University of New York5, University of Oxford6, Cornell University7, University of Padua8, Swiss Institute of Bioinformatics9, Friedrich Miescher Institute for Biomedical Research10, Roswell Park Cancer Institute11, Johns Hopkins University12
TL;DR: This Perspective highlights open-source software for single-cell analysis released as part of the Bioconductor project, providing an overview for users and developers.
Abstract: Recent technological advancements have enabled the profiling of a large number of genome-wide features in individual cells. However, single-cell data present unique challenges that require the development of specialized methods and software infrastructure to successfully derive biological insights. The Bioconductor project has rapidly grown to meet these demands, hosting community-developed open-source software distributed as R packages. Featuring state-of-the-art computational methods, standardized data infrastructure and interactive data visualization tools, we present an overview and online book (https://osca.bioconductor.org) of single-cell methods for prospective users.
332 citations
••
TL;DR: A systematic evaluation of state-of-the-art algorithms for inferring gene regulatory networks from single-cell transcriptional data finds heterogeneous performance and suggests recommendations to users.
Abstract: We present a systematic evaluation of state-of-the-art algorithms for inferring gene regulatory networks from single-cell transcriptional data. As the ground truth for assessing accuracy, we use synthetic networks with predictable trajectories, literature-curated Boolean models and diverse transcriptional regulatory networks. We develop a strategy to simulate single-cell transcriptional data from synthetic and Boolean networks that avoids pitfalls of previously used methods. Furthermore, we collect networks from multiple experimental single-cell RNA-seq datasets. We develop an evaluation framework called BEELINE. We find that the area under the precision-recall curve and early precision of the algorithms are moderate. The methods are better in recovering interactions in synthetic networks than Boolean models. The algorithms with the best early precision values for Boolean models also perform well on experimental datasets. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, we present recommendations to end users. BEELINE will aid the development of gene regulatory network inference algorithms.
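One of the metrics named above, early precision, is simple to state: the fraction of true edges among the top-k predicted edges, with k equal to the number of ground-truth edges. The sketch below uses invented gene names and edge lists purely for illustration; it is not BEELINE's code.

```python
def early_precision(ranked_edges, true_edges):
    """Fraction of the top-k ranked edges that are in the ground truth,
    where k is the number of ground-truth edges."""
    k = len(true_edges)
    top_k = ranked_edges[:k]
    return sum(1 for edge in top_k if edge in true_edges) / k

# Hypothetical ranked output of an inference algorithm (most confident first).
predicted = [("g1", "g2"), ("g3", "g4"), ("g2", "g5"), ("g1", "g5")]
truth = {("g1", "g2"), ("g2", "g5"), ("g4", "g1")}

print(early_precision(predicted, truth))  # 2 of the top 3 are true -> 0.666...
```

Because it only looks at the top of the ranking, early precision rewards methods whose most confident predictions are correct, even if their full ranking is noisy.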
••
TL;DR: MetaFlye is presented, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity; benchmarking on simulated and mock bacterial communities shows that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers.
Abstract: Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and show that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.
••
TL;DR: It is shown that by localizing individual switchable fluorophores with a probing donut-shaped excitation beam, MINFLUX nanoscopy can provide resolutions in the range of 1 to 3 nm for structures in fixed and living cells.
Abstract: The ultimate goal of biological super-resolution fluorescence microscopy is to provide three-dimensional resolution at the size scale of a fluorescent marker. Here we show that by localizing individual switchable fluorophores with a probing donut-shaped excitation beam, MINFLUX nanoscopy can provide resolutions in the range of 1 to 3 nm for structures in fixed and living cells. This progress has been facilitated by approaching each fluorophore iteratively with the probing-donut minimum, making the resolution essentially uniform and isotropic over scalable fields of view. MINFLUX imaging of nuclear pore complexes of a mammalian cell shows that this true nanometer-scale resolution is obtained in three dimensions and in two color channels. Relying on fewer detected photons than standard camera-based localization, MINFLUX nanoscopy is poised to open a new chapter in the imaging of protein complexes and distributions in fixed and living cells. Advances in MINFLUX nanoscopy enable multicolor imaging over large fields of view, bringing true nanometer-scale fluorescence imaging to labeled structures in fixed and living cells.
••
TL;DR: This work uses the correlation of molecular weight and ion mobility in a trapped ion mobility device to devise a scan mode that samples up to 100% of the peptide precursor ion current in m/z and mobility windows and thereby increase the specificity for precursor identification.
Abstract: Data-independent acquisition modes isolate and concurrently fragment populations of different precursors by cycling through segments of a predefined precursor m/z range. Although these selection windows collectively cover the entire m/z range, overall, only a few per cent of all incoming ions are isolated for mass analysis. Here, we make use of the correlation of molecular weight and ion mobility in a trapped ion mobility device (timsTOF Pro) to devise a scan mode that samples up to 100% of the peptide precursor ion current in m/z and mobility windows. We extend an established targeted data extraction workflow by inclusion of the ion mobility dimension for both signal extraction and scoring and thereby increase the specificity for precursor identification. Data acquired from whole proteome digests and mixed organism samples demonstrate deep proteome coverage and a high degree of reproducibility as well as quantitative accuracy, even from 10 ng sample amounts. diaPASEF makes use of the correlation between the ion mobility and the m/z of peptides to trap and release precursor ions in a TIMS-TOF mass spectrometer for an almost complete sampling of the precursor ion beam with data-independent acquisition.
••
TL;DR: The design principles and advances in fluorescence nanothermometry are introduced, application achievements are highlighted, scenarios that may lead to biased sensing are discussed, and the challenges ahead are analyzed in terms of both fundamental issues and practical implementations.
Abstract: Fluorescent nanothermometers can probe changes in local temperature in living cells and in vivo and reveal fundamental insights into biological properties. This field has attracted global efforts in developing both temperature-responsive materials and detection procedures to achieve sub-degree temperature resolution in biosystems. Recent generations of nanothermometers show superior performance to earlier ones and also offer multifunctionality, enabling state-of-the-art functional imaging with improved spatial, temporal and temperature resolutions for monitoring the metabolism of intracellular organelles and internal organs. Although progress in this field has been rapid, it has not been without controversy, as recent studies have shown possible biased sensing during fluorescence-based detection. Here, we introduce the design principles and advances in fluorescence nanothermometry, highlight application achievements, discuss scenarios that may lead to biased sensing, analyze the challenges ahead in terms of both fundamental issues and practical implementations, and point to new directions for improving this interdisciplinary field.
••
TL;DR: A set of isobaric labeling reagents called TMTpro enables deep quantitative comparisons of proteome measurements across 16 samples, and identifies and dose-stratifies staurosporine binding to 228 cellular kinases in just one 18-h experiment.
Abstract: Isobaric labeling empowers proteome-wide expression measurements simultaneously across multiple samples. Here an expanded set of 16 isobaric reagents based on an isobutyl-proline immonium ion reporter structure (TMTpro) is presented. These reagents have similar characteristics to existing tandem mass tag reagents but with increased fragmentation efficiency and signal. In a proteome-scale example dataset, we compared eight common cell lines with and without Torin1 treatment with three replicates, quantifying more than 8,800 proteins (mean of 7.5 peptides per protein) per replicate with an analysis time of only 1.1 h per proteome. Finally, we modified the thermal stability assay to examine proteome-wide melting shifts after treatment with DMSO, 1 or 20 µM staurosporine with five replicates. This assay identified and dose-stratified staurosporine binding to 228 cellular kinases in just one, 18-h experiment. TMTpro reagents allow complex experimental designs—all with essentially no missing values across the 16 samples and no loss in quantitative integrity. A set of isobaric labeling reagents called TMTpro enables deep quantitative comparisons of proteome measurements across 16 samples.
••
TL;DR: Development of single-cell multimodal omics tools is another major step toward understanding the inner workings of biological systems.
Abstract: Advances in single-cell genomics technologies have enabled investigation of the gene regulation programs of multicellular organisms at unprecedented resolution and scale. Development of single-cell multimodal omics tools is another major step toward understanding the inner workings of biological systems.
••
TL;DR: Analyzing four published spatially resolved transcriptomic datasets using SPARK shows it can be up to ten times more powerful than existing methods and disclose biological discoveries that otherwise cannot be revealed by existing approaches.
Abstract: Identifying genes that display spatial expression patterns in spatially resolved transcriptomic studies is an important first step toward characterizing the spatial transcriptomic landscape of complex tissues. Here we present a statistical method, SPARK, for identifying spatial expression patterns of genes in data generated from various spatially resolved transcriptomic techniques. SPARK directly models spatial count data through generalized linear spatial models. It relies on recently developed statistical formulas for hypothesis testing, providing effective control of type I errors and yielding high statistical power. With a computationally efficient algorithm, which is based on penalized quasi-likelihood, SPARK is also scalable to datasets with tens of thousands of genes measured on tens of thousands of samples. Analyzing four published spatially resolved transcriptomic datasets using SPARK, we show it can be up to ten times more powerful than existing methods and disclose biological discoveries that otherwise cannot be revealed by existing approaches. A statistical method called SPARK for analyzing spatially resolved transcriptomic data can efficiently identify spatially expressed genes with effective control of type I errors and high statistical power.
••
TL;DR: A density-modification procedure for improving maps from single-particle electron cryogenic microscopy (cryo-EM) improves map-model correlation and increases the visibility of details in many of the maps.
Abstract: A density-modification procedure for improving maps from single-particle electron cryogenic microscopy (cryo-EM) is presented. The theoretical basis of the method is identical to that of maximum-likelihood density modification, previously used to improve maps from macromolecular X-ray crystallography. Key differences from applications in crystallography are that the errors in Fourier coefficients are largely in the phases in crystallography but in both phases and amplitudes in cryo-EM, and that half-maps with independent errors are available in cryo-EM. These differences lead to a distinct approach for combination of information from starting maps with information obtained in the density-modification process. The density-modification procedure was applied to a set of 104 datasets and improved map-model correlation and increased the visibility of details in many of the maps. The procedure requires two unmasked half-maps and a sequence file or other source of information on the volume of the macromolecule that has been imaged.
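The availability of two half-maps with independent errors is central to the error model described above: correlating the Fourier coefficients of the halves estimates how much signal survives the noise. The sketch below demonstrates this on a synthetic 1D "map" (a sine wave plus noise), not real cryo-EM data, and collapses all frequencies into one global correlation rather than per-shell curves.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256

# A shared underlying signal, observed twice with independent noise,
# mimicking two half-maps reconstructed from disjoint particle subsets.
signal = np.sin(np.linspace(0, 8 * np.pi, n))
half1 = signal + rng.normal(0, 0.5, n)
half2 = signal + rng.normal(0, 0.5, n)

# Correlate the Fourier coefficients of the two halves. Because the noise is
# independent, the cross terms average out and the correlation reflects the
# signal-to-total-power ratio.
f1, f2 = np.fft.rfft(half1), np.fft.rfft(half2)
corr = np.real(np.sum(f1 * np.conj(f2))) / np.sqrt(
    np.sum(np.abs(f1) ** 2) * np.sum(np.abs(f2) ** 2))
print(round(corr, 2))
```

Binning the same quantity by spatial-frequency shell gives the familiar Fourier shell correlation curve; the same independence is what lets half-map-based procedures separate signal from error during density modification.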
••
TL;DR: The GRABDA sensors resolve evoked DA release in mouse brain slices, detect evoked compartmental DA release from a single neuron in live flies, and report optogenetically elicited nigrostriatal DA release as well as mesoaccumbens dopaminergic activity during sexual behavior in freely behaving mice.
Abstract: Dopamine (DA) plays a critical role in the brain, and the ability to directly measure dopaminergic activity is essential for understanding its physiological functions. We therefore developed red fluorescent G-protein-coupled receptor-activation-based DA (GRABDA) sensors and optimized versions of green fluorescent GRABDA sensors. In response to extracellular DA, both the red and green GRABDA sensors exhibit a large increase in fluorescence, with subcellular resolution, subsecond kinetics and nanomolar-to-submicromolar affinity. Moreover, the GRABDA sensors resolve evoked DA release in mouse brain slices, detect evoked compartmental DA release from a single neuron in live flies and report optogenetically elicited nigrostriatal DA release as well as mesoaccumbens dopaminergic activity during sexual behavior in freely behaving mice. Coexpressing red GRABDA with either green GRABDA or the calcium indicator GCaMP6s allows tracking of dopaminergic signaling and neuronal activity in distinct circuits in vivo.
••
TL;DR: A deep learning-based framework to quantify and analyze brain vasculature, named Vessel Segmentation & Analysis Pipeline (VesSAP), which uses a convolutional neural network with a transfer learning approach for segmentation and achieves human-level accuracy.
Abstract: Tissue clearing methods enable the imaging of biological specimens without sectioning. However, reliable and scalable analysis of large imaging datasets in three dimensions remains a challenge. Here we developed a deep learning-based framework to quantify and analyze brain vasculature, named Vessel Segmentation & Analysis Pipeline (VesSAP). Our pipeline uses a convolutional neural network (CNN) with a transfer learning approach for segmentation and achieves human-level accuracy. By using VesSAP, we analyzed the vascular features of whole C57BL/6J, CD1 and BALB/c mouse brains at the micrometer scale after registering them to the Allen mouse brain atlas. We report evidence of secondary intracranial collateral vascularization in CD1 mice and find reduced vascularization of the brainstem in comparison to the cerebrum. Thus, VesSAP enables unbiased and scalable quantifications of the angioarchitecture of cleared mouse brains and yields biological insights into the vascular function of the brain.
••
TL;DR: The Philosopher toolkit integrates high-performance algorithms and existing tools and is a dependency-free, fast and comprehensive proteomics pipeline, able to rapidly process even the most complex proteomics datasets with efficient resource management.
Abstract: To the Editor — Here we introduce Philosopher (https://philosopher.nesvilab.org), a free, open-source, versatile and robust data analysis toolkit designed to bring easy access to a powerful and comprehensive set of computational tools for shotgun proteomics data analysis. Computational analysis is a central component of any modern experiment, and mass-spectrometry-based proteomics is no exception. As technologies continue to rapidly advance with respect to throughput and sensitivity, bioinformatics tools must keep pace with large-scale experiments. While existing proteomics tools such as the Trans-Proteomic Pipeline (TPP)1, MaxQuant2 and PeptideShaker3 are capable of performing high-quality analyses, all require installation and depend on specific operating systems, libraries and other software. Managing these tools can be a daunting task, even for research groups with substantial bioinformatics expertise. This is particularly true when experiments demand high-performance configurations such as GNU/Linux clusters or cloud computing. To address this challenge, we initially built and deployed Docker containers with different applications for proteomics, which in part inspired the creation of the BioContainers resource for different bioinformatics fields4. Though this method was efficient for packing and sharing resources, we found that chaining different applications with custom implementation of established algorithms in a transparent and dependency-free way was still a challenge for containerization. The Philosopher toolkit integrates high-performance algorithms and existing tools (Fig. 1) and is a dependency-free, fast and comprehensive proteomics pipeline, able to rapidly process even the most complex proteomics datasets with efficient resource management. Philosopher includes the database search engine Comet and can use the high-performance search engine MSFragger5 as a separately downloaded tool.
For downstream processing of peptide–spectrum matches (PSMs), Philosopher includes key components of TPP. In addition, it implements best practices for false discovery rate (FDR) filtering and data summarization that are not readily available within the TPP, such as picked FDR, two-dimensional or sequential (at PSM and protein levels) filters, and additional options for dealing with peptides whose sequence is present in multiple proteins (for example, the razor peptide approach). As quantification is frequently the goal of modern proteomics experiments, Philosopher includes algorithms for both label-free quantification and isobaric label-based quantification (TMT or iTRAQ). Precursor spectral intensities are retrieved following a method described previously6. Protein-level quantification is estimated using the sum of the three most intense supporting ions. Alternatively, Philosopher can use TMT-Integrator (http://tmt-integrator.nesvilab.org/) as an external tool or output files can be used with downstream quantification and statistical tools such as MSstats7. The rich reports generated by Philosopher are also compatible with other software such as PDV for visualization of peptide assignments to tandem mass spectra8 and CRAPome and REPRINT (https://reprint-apms.org/) for interactome scoring and network
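The FDR filtering mentioned above follows the standard target-decoy logic: walk down the score-ranked PSM list and accept the largest prefix whose estimated FDR (decoys over targets) stays below the cutoff. This is a toy sketch of that general principle with invented scores, not Philosopher's implementation (which adds picked and two-dimensional variants).

```python
def filter_psms(psms, fdr_cutoff=0.01):
    """psms: list of (score, is_decoy) tuples. Returns the accepted target PSMs
    under a simple target-decoy FDR estimate (decoy count / target count)."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets, decoys, best_cut = 0, 0, 0
    for i, (score, is_decoy) in enumerate(ranked, 1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cut point that still satisfies the FDR threshold.
        if targets and decoys / targets <= fdr_cutoff:
            best_cut = i
    return [p for p in ranked[:best_cut] if not p[1]]

# Hypothetical search results: (search-engine score, matched a decoy sequence?).
psms = [(10.0, False), (9.5, False), (9.0, True), (8.0, False), (7.0, False)]
accepted = filter_psms(psms, fdr_cutoff=0.5)
print(len(accepted))  # 4 target PSMs survive at this (deliberately loose) cutoff
```

In practice the cutoff is 0.01, the decoys come from a reversed or shuffled protein database, and the same filter is applied sequentially at the PSM and protein levels.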
••
TL;DR: DeepSTORM3D uses deep learning for accurate localization of point emitters in densely labeled samples in three dimensions for volumetric localization microscopy with high temporal resolution, as well as for optimal point-spread function design.
Abstract: An outstanding challenge in single-molecule localization microscopy is the accurate and precise localization of individual point emitters in three dimensions in densely labeled samples. One established approach for three-dimensional single-molecule localization is point-spread-function (PSF) engineering, in which the PSF is engineered to vary distinctively with emitter depth using additional optical elements. However, images of dense emitters, which are desirable for improving temporal resolution, pose a challenge for algorithmic localization of engineered PSFs, due to lateral overlap of the emitter PSFs. Here we train a neural network to localize multiple emitters with densely overlapping Tetrapod PSFs over a large axial range. We then use the network to design the optimal PSF for the multi-emitter case. We demonstrate our approach experimentally with super-resolution reconstructions of mitochondria and volumetric imaging of fluorescently labeled telomeres in cells. Our approach, DeepSTORM3D, enables the study of biological processes in whole cells at timescales that are rarely explored in localization microscopy.
••
TL;DR: Souporcell is developed, a method to cluster cells using the genetic variants detected within the scRNA-seq reads, which achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.
Abstract: Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.
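The core assignment step (matching a cell's observed variant alleles to a donor genotype) can be sketched with a simple per-variant binomial log-likelihood. This is a toy illustration of the general idea, not souporcell's actual model, which clusters cells without requiring known genotypes; the function name and error-rate parameter are assumptions.

```python
import math

def assign_donor(alt_counts, ref_counts, donor_genotypes, err=0.01):
    """Assign a cell to the donor whose genotypes best explain its
    observed alt/ref allele counts at a shared set of variants.
    donor_genotypes maps donor -> list of expected alt-allele
    fractions (0.0, 0.5 or 1.0) per variant; err clamps homozygous
    sites so sequencing errors have nonzero probability."""
    best, best_ll = None, -math.inf
    for donor, gts in donor_genotypes.items():
        ll = 0.0
        for a, r, g in zip(alt_counts, ref_counts, gts):
            p = min(max(g, err), 1 - err)
            ll += a * math.log(p) + r * math.log(1 - p)
        if ll > best_ll:
            best, best_ll = donor, ll
    return best
```

With two donors of opposite homozygous genotypes, a cell whose reads consistently carry donor 1's alleles is assigned to donor 1; a doublet would fit neither donor well, which is the signal used for cross-genotype doublet detection.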
••
TL;DR: Q-score analysis of multiple cryo-EM maps of the same proteins derived from different laboratories confirms the reproducibility of structural features from side chains down to water and ion atoms, and can be used at the atom, residue or macromolecule scale.
Abstract: Cryogenic electron microscopy (cryo-EM) maps are now at the point where resolvability of individual atoms can be achieved. However, resolvability is not necessarily uniform throughout the map. We introduce a quantitative parameter to characterize the resolvability of individual atoms in cryo-EM maps, the map Q-score. Q-scores can be calculated for atoms in proteins, nucleic acids, water, ligands and other solvent atoms, using models fitted to or derived from cryo-EM maps. Q-scores can also be averaged to represent larger features such as entire residues and nucleotides. Averaged over entire models, Q-scores correlate very well with the estimated resolution of cryo-EM maps for both protein and RNA. Assuming the models they are calculated from are well fitted to the map, Q-scores can be used as a measure of resolvability in cryo-EM maps at various scales, from entire macromolecules down to individual atoms. Q-score analysis of multiple cryo-EM maps of the same proteins derived from different laboratories confirms the reproducibility of structural features from side chains down to water and ion atoms.
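The multi-scale averaging described above (per-atom Q-scores rolled up to residues and then to the whole model) can be sketched as follows. The per-atom Q-score computation itself depends on the map and fitted model and is not reproduced here; the helper name and input layout are assumptions for illustration.

```python
from collections import defaultdict

def average_scores(atom_scores):
    """Average per-atom Q-scores up to the residue level and then over
    the whole model. atom_scores maps (residue_id, atom_name) -> Q."""
    per_residue = defaultdict(list)
    for (res_id, _atom), q in atom_scores.items():
        per_residue[res_id].append(q)
    residue_avg = {r: sum(v) / len(v) for r, v in per_residue.items()}
    model_avg = sum(residue_avg.values()) / len(residue_avg)
    return residue_avg, model_avg
```

The model-level average is the quantity that, per the abstract, correlates well with the estimated resolution of the map.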
••
TL;DR: This work introduces probabilistic cell typing by in situ sequencing (pciSeq), an approach that leverages previous scRNA-seq classification to identify cell types using multiplexed in situ RNA detection to spatially map cell types accurately in the mouse hippocampus and isocortex.
Abstract: Understanding the function of a tissue requires knowing the spatial organization of its constituent cell types. In the cerebral cortex, single-cell RNA sequencing (scRNA-seq) has revealed the genome-wide expression patterns that define its many, closely related neuronal types, but cannot reveal their spatial arrangement. Here we introduce probabilistic cell typing by in situ sequencing (pciSeq), an approach that leverages previous scRNA-seq classification to identify cell types using multiplexed in situ RNA detection. We applied this method by mapping the inhibitory neurons of mouse hippocampal area CA1, for which ground truth is available from extensive previous work identifying their laminar organization. Our method identified these neuronal classes in a spatial arrangement matching ground truth, and further identified multiple classes of isocortical pyramidal cell in a pattern matching their known organization. This method will allow identifying the spatial organization of closely related cell types across the brain and other tissues.
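The probabilistic typing idea (scoring a cell's in situ gene counts against expression profiles learned from scRNA-seq) can be sketched as a naive Poisson classifier. This is a simplified stand-in for pciSeq's full model, which also handles spot-to-cell assignment; the function name and Poisson-per-gene assumption are illustrative.

```python
import math

def cell_type_posterior(gene_counts, type_means, prior=None):
    """Posterior over cell types for one cell, assuming each gene's
    in situ count is Poisson with a type-specific mean taken from an
    scRNA-seq reference (type_means: type -> list of per-gene means)."""
    types = list(type_means)
    if prior is None:
        prior = {t: 1.0 / len(types) for t in types}
    log_post = {}
    for t in types:
        ll = math.log(prior[t])
        for k, mu in zip(gene_counts, type_means[t]):
            mu = max(mu, 1e-6)  # guard against zero reference means
            ll += k * math.log(mu) - mu - math.lgamma(k + 1)
        log_post[t] = ll
    m = max(log_post.values())  # log-sum-exp normalization
    exps = {t: math.exp(v - m) for t, v in log_post.items()}
    z = sum(exps.values())
    return {t: v / z for t, v in exps.items()}
```

A cell expressing a type-A marker gene but not a type-B marker receives nearly all of its posterior mass on type A.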
••
TL;DR: The GRABACh (GPCR-activation-based ACh) sensor is optimized to achieve substantially improved sensitivity in ACh detection, as well as reduced downstream coupling to intracellular pathways.
Abstract: The ability to directly measure acetylcholine (ACh) release is an essential step toward understanding its physiological function. Here we optimized the GRABACh (GPCR-activation-based ACh) sensor to achieve substantially improved sensitivity in ACh detection, as well as reduced downstream coupling to intracellular pathways. The improved version of the ACh sensor retains the subsecond response kinetics, physiologically relevant affinity and precise molecular specificity for ACh of its predecessor. Using this sensor, we revealed compartmental ACh signals in the olfactory center of transgenic flies in response to external stimuli including odor and body shock. Using fiber photometry recording and two-photon imaging, our ACh sensor also enabled sensitive detection of single-trial ACh dynamics in multiple brain regions in mice performing a variety of behaviors.
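Photometry and imaging traces from fluorescent sensors like GRABACh are conventionally reported as ΔF/F0. The sketch below shows the standard normalization with a simple pre-stimulus baseline; the function name and baseline choice are assumptions, not the paper's specific analysis pipeline.

```python
def delta_f_over_f(trace, baseline_frames=50):
    """Convert a raw fluorescence trace into dF/F0, using the mean of
    the first baseline_frames samples as the baseline F0."""
    n = min(baseline_frames, len(trace))
    f0 = sum(trace[:n]) / n
    return [(f - f0) / f0 for f in trace]
```

A sample whose fluorescence doubles relative to baseline yields ΔF/F0 = 1.0, i.e. a 100% response.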
••
TL;DR: Applications of Acr proteins for post-translational control of CRISPR-Cas systems in prokaryotic and mammalian cells, organisms and ecosystems are discussed.
Abstract: Clustered, regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes, a diverse family of prokaryotic adaptive immune systems, have emerged as a biotechnological tool and therapeutic. The discovery of protein inhibitors of CRISPR-Cas systems, called anti-CRISPR (Acr) proteins, enables the development of more controllable and precise CRISPR-Cas tools. Here we discuss applications of Acr proteins for post-translational control of CRISPR-Cas systems in prokaryotic and mammalian cells, organisms and ecosystems.
••
TL;DR: Computational methods for analysis and integration of single-cell omics data across different modalities are summarized and their applications, challenges and future directions are discussed.
Abstract: Single-cell omics approaches provide high-resolution data on cellular phenotypes, developmental dynamics and communication networks in diverse tissues and conditions. Emerging technologies now measure different modalities of individual cells, such as genomes, epigenomes, transcriptomes and proteomes, in addition to spatial profiling. Combined with analytical approaches, these data open new avenues for accurate reconstruction of gene-regulatory and signaling networks driving cellular identity and function. Here we summarize computational methods for analysis and integration of single-cell omics data across different modalities and discuss their applications, challenges and future directions.
••
TL;DR: A convolutional neural network, Akita, is presented that accurately predicts genome folding from DNA sequence alone and can be used to perform in silico saturation mutagenesis, interpret eQTLs, make predictions for structural variants, and probe species-specific genome folding.
Abstract: In interphase, the human genome sequence folds in three dimensions into a rich variety of locus-specific contact patterns. Cohesin and CTCF (CCCTC-binding factor) are key regulators; perturbing the levels of either greatly disrupts genome-wide folding as assayed by chromosome conformation capture methods. Still, how a given DNA sequence encodes a particular locus-specific folding pattern remains unknown. Here we present a convolutional neural network, Akita, that accurately predicts genome folding from DNA sequence alone. Representations learned by Akita underscore the importance of an orientation-specific grammar for CTCF binding sites. Akita learns predictive nucleotide-level features of genome folding, revealing effects of nucleotides beyond the core CTCF motif. Once trained, Akita enables rapid in silico predictions. Accounting for this, we demonstrate how Akita can be used to perform in silico saturation mutagenesis, interpret eQTLs, make predictions for structural variants and probe species-specific genome folding. Collectively, these results enable decoding genome function from sequence through structure.
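Sequence-to-folding models like Akita consume DNA as a numeric array, conventionally a one-hot encoding, and in silico mutagenesis amounts to editing that array and re-running the model. The encoder below is a generic sketch of this input representation, not Akita's exact preprocessing; the function name is an assumption.

```python
import numpy as np

def one_hot_dna(seq):
    """One-hot encode a DNA string into a (len, 4) float32 array with
    columns A, C, G, T; ambiguous bases such as N become all-zero rows."""
    lookup = {"A": 0, "C": 1, "G": 2, "T": 3}
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = lookup.get(base)
        if j is not None:
            arr[i, j] = 1.0
    return arr
```

For saturation mutagenesis, each position is swapped to each alternative base in turn and the model's predicted contact map is compared with the reference prediction, localizing sequence elements (such as CTCF motifs) that drive folding.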