Showing papers in "Nature Biotechnology in 2019"


Journal ArticleDOI
Evan Bolyen, Jai Ram Rideout, Matthew R. Dillon, Nicholas A. Bokulich, Christian C. Abnet, Gabriel A. Al-Ghalith, Harriet Alexander, Eric J. Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E. Bisanz, Kyle Bittinger, Asker Daniel Brejnrod, Colin J. Brislawn, C. Titus Brown, Benjamin J. Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily K. Cope, Ricardo Silva, Christian Diener, Pieter C. Dorrestein, Gavin M. Douglas, Daniel M. Durall, Claire Duvallet, Christian F. Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M. Gauglitz, Sean M. Gibbons, Deanna L. Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin A. Huttley, Stefan Janssen, Alan K. Jarmusch, Lingjing Jiang, Benjamin D. Kaehler, Kyo Bin Kang, Christopher R. Keefe, Paul Keim, Scott T. Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan G. I. Langille, Joslynn S. Lee, Ruth E. Ley, Yong-Xin Liu, Erikka Loftfield, Catherine A. Lozupone, Massoud Maher, Clarisse Marotz, Bryan D. Martin, Daniel McDonald, Lauren J. McIver, Alexey V. Melnik, Jessica L. Metcalf, Sydney C. Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A. Navas-Molina, Louis-Félix Nothias, Stephanie B. Orchanian, Talima Pearson, Samuel L. Peoples, Daniel Petras, Mary L. Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam R. Rivers, Michael S. Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R. Spear, Austin D. Swafford, Luke R. Thompson, Pedro J. Torres, Pauline Trinh, Anupriya Tripathi, Peter J. Turnbaugh, Sabah Ul-Hasan, Justin J. J. van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William A. Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C. Weber, Charles H. D. Williamson, Amy D. Willis, Zhenjiang Zech Xu, Jesse R. Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight, J. Gregory Caporaso
TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K.; partial support was also provided by grants NIH U54CA143925 and U54MD012388, among others.
Abstract: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and 1565057 to R.K. Partial support was also provided by the following: grants NIH U54CA143925 (J.G.C. and T.P.) and U54MD012388 (J.G.C. and T.P.); grants from the Alfred P. Sloan Foundation (J.G.C. and R.K.); ERCSTG project MetaPG (N.S.); the Strategic Priority Research Program of the Chinese Academy of Sciences QYZDB-SSW-SMC021 (Y.B.); the Australian National Health and Medical Research Council APP1085372 (G.A.H., J.G.C., Von Bing Yap and R.K.); the Natural Sciences and Engineering Research Council (NSERC) to D.L.G.; and the State of Arizona Technology and Research Initiative Fund (TRIF), administered by the Arizona Board of Regents, through Northern Arizona University. All NCI coauthors were supported by the Intramural Research Program of the National Cancer Institute. S.M.G. and C. Diener were supported by the Washington Research Foundation Distinguished Investigator Award.

8,821 citations


Journal ArticleDOI
TL;DR: This work presents a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index, and uses it to represent and search an expanded model of the human reference genome.
Abstract: The human reference genome represents only a small number of individuals, which limits its usefulness for genotyping. We present a method named HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) that can align both DNA and RNA sequences using a graph Ferragina Manzini index. We use HISAT2 to represent and search an expanded model of the human reference genome in which over 14.5 million genomic variants in combination with haplotypes are incorporated into the data structure used for searching and alignment. We benchmark HISAT2 using simulated and real datasets to demonstrate that our strategy of representing a population of genomes, together with a fast, memory-efficient search algorithm, provides more detailed and accurate variant analyses than other methods. We apply HISAT2 for HLA typing and DNA fingerprinting; both applications form part of the HISAT-genotype software that enables analysis of haplotype-resolved genes or genomic regions. HISAT-genotype outperforms other computational methods and matches or exceeds the performance of laboratory-based assays. A graph-based genome indexing scheme enables variant-aware alignment of sequences with very low memory requirements.

4,855 citations
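
The HISAT2 abstract above refers to a graph Ferragina Manzini (FM) index. As a rough illustration of the underlying idea only, the sketch below builds a plain (linear, non-graph) FM index over a single string and finds exact matches by backward search. It is not HISAT2's data structure or code; the function names `build_fm_index` and `find` are hypothetical, and the naive suffix sort is only suitable for short toy inputs.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index (suffix array, first-column offsets, occurrence table)."""
    text = text + "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array: fine for a toy example, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # first[c]: number of characters in the text strictly smaller than c.
    counts = Counter(text)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, first, occ


def find(pattern, index):
    """Return sorted start positions of exact matches, using backward search."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("GATTACAGATTACA")
print(find("ATTA", index))
```

Each step of the backward search narrows a range of suffix-array rows using only the occurrence table, which is why the search time depends on the pattern length rather than the text length.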


Journal ArticleDOI
TL;DR: Comparing the performance of UMAP with five other tools, it is found that UMAP provides the fastest run times, highest reproducibility and the most meaningful organization of cell clusters.
Abstract: Advances in single-cell technologies have enabled high-resolution dissection of tissue composition. Several tools for dimensionality reduction are available to analyze the large number of parameters generated in single-cell studies. Recently, a nonlinear dimensionality-reduction technique, uniform manifold approximation and projection (UMAP), was developed for the analysis of any type of high-dimensional data. Here we apply it to biological data, using three well-characterized mass cytometry and single-cell RNA sequencing datasets. Comparing the performance of UMAP with five other tools, we find that UMAP provides the fastest run times, highest reproducibility and the most meaningful organization of cell clusters. The work highlights the use of UMAP for improved visualization and interpretation of single-cell data.

3,016 citations


Journal ArticleDOI
TL;DR: A deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs is presented.
Abstract: Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.

2,732 citations


Journal ArticleDOI
TL;DR: Flye is a long-read assembly algorithm that generates arbitrary paths in an unknown repeat graph, called disjointigs, and constructs an accurate repeat graph from these error-riddled disjointigs for use in genome assembly.
Abstract: Accurate genome assembly is hampered by repetitive regions. Although long single molecule sequencing reads are better able to resolve genomic repeats than short-read data, most long-read assembly algorithms do not provide the repeat characterization necessary for producing optimal assemblies. Here, we present Flye, a long-read assembly algorithm that generates arbitrary paths in an unknown repeat graph, called disjointigs, and constructs an accurate repeat graph from these error-riddled disjointigs. We benchmark Flye against five state-of-the-art assemblers and show that it generates better or comparable assemblies, while being an order of magnitude faster. Flye nearly doubled the contiguity of the human genome assembly (as measured by the NGA50 assembly quality metric) compared with existing assemblers.

1,927 citations
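
The Flye abstract above measures contiguity with NGA50. As a minimal sketch of the family of metrics involved, the code below computes N50 and NG50 from a list of contig lengths. NGA50 is generally computed the same way as NG50, but over blocks that align to a reference after contigs are split at misassemblies; that alignment step is not shown here. The function names and the example lengths are hypothetical.

```python
def nx50(lengths, target_size):
    """Largest length L such that pieces of length >= L cover half of target_size."""
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= target_size:
            return length
    return 0  # the pieces do not cover half of the target


def n50(lengths):
    """N50: the target is the total assembly length."""
    return nx50(lengths, sum(lengths))


def ng50(lengths, genome_size):
    """NG50: the target is the (estimated) genome size."""
    return nx50(lengths, genome_size)


contigs = [80, 70, 50, 40, 30, 20]  # made-up contig lengths
print(n50(contigs), ng50(contigs, genome_size=400))
```

Because NG50 and NGA50 are normalized by genome size rather than assembly size, they can be compared across assemblers that output different total lengths.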


Journal ArticleDOI
TL;DR: The utility of CIBERSORTx is evaluated in multiple tumor types, including melanoma, where single-cell reference profiles were used to dissect bulk clinical specimens, revealing cell-type-specific phenotypic states linked to distinct driver mutations and response to immune checkpoint blockade.
Abstract: Single-cell RNA-sequencing has emerged as a powerful technique for characterizing cellular heterogeneity, but it is currently impractical on large sample cohorts and cannot be applied to fixed specimens collected as part of routine clinical care. We previously developed an approach for digital cytometry, called CIBERSORT, that enables estimation of cell type abundances from bulk tissue transcriptomes. We now introduce CIBERSORTx, a machine learning method that extends this framework to infer cell-type-specific gene expression profiles without physical cell isolation. By minimizing platform-specific variation, CIBERSORTx also allows the use of single-cell RNA-sequencing data for large-scale tissue dissection. We evaluated the utility of CIBERSORTx in multiple tumor types, including melanoma, where single-cell reference profiles were used to dissect bulk clinical specimens, revealing cell-type-specific phenotypic states linked to distinct driver mutations and response to immune checkpoint blockade. We anticipate that digital cytometry will augment single-cell profiling efforts, enabling cost-effective, high-throughput tissue characterization without the need for antibodies, disaggregation or viable cells. CIBERSORTx, a suite of computational tools, enables inference of cell type abundance and cell-type-specific gene expression profiles from bulk RNA profiles.

1,812 citations


Journal ArticleDOI
TL;DR: Although wearable biosensors hold promise, a better understanding of the correlations between analyte concentrations in the blood and noninvasive biofluids is needed to improve reliability.
Abstract: Wearable biosensors are garnering substantial interest due to their potential to provide continuous, real-time physiological information via dynamic, noninvasive measurements of biochemical markers in biofluids, such as sweat, tears, saliva and interstitial fluid. Recent developments have focused on electrochemical and optical biosensors, together with advances in the noninvasive monitoring of biomarkers including metabolites, bacteria and hormones. A combination of multiplexed biosensing, microfluidic sampling and transport systems have been integrated, miniaturized and combined with flexible materials for improved wearability and ease of operation. Although wearable biosensors hold promise, a better understanding of the correlations between analyte concentrations in the blood and noninvasive biofluids is needed to improve reliability. An expanded set of on-body bioaffinity assays and more sensing strategies are needed to make more biomarkers accessible to monitoring. Large-cohort validation studies of wearable biosensor performance will be needed to underpin clinical acceptance. Accurate and reliable real-time sensing of physiological information using wearable biosensor technologies would have a broad impact on our daily lives.

1,579 citations


Journal ArticleDOI
TL;DR: The authors comprehensively benchmark the accuracy, scalability, stability and usability of 45 single-cell trajectory inference methods and develop a set of guidelines to help users select the best method for their dataset.
Abstract: Trajectory inference approaches analyze genome-wide omics data from thousands of single cells and computationally infer the order of these cells along developmental trajectories. Although more than 70 trajectory inference tools have already been developed, it is challenging to compare their performance because the input they require and output models they produce vary substantially. Here, we benchmark 45 of these methods on 110 real and 229 synthetic datasets for cellular ordering, topology, scalability and usability. Our results highlight the complementarity of existing tools, and that the choice of method should depend mostly on the dataset dimensions and trajectory topology. Based on these results, we develop a set of guidelines to help users select the best method for their dataset. Our freely available data and evaluation pipeline ( https://benchmark.dynverse.org ) will aid in the development of improved tools designed to analyze increasingly large and complex single-cell datasets.

928 citations


Journal ArticleDOI
TL;DR: The optimization of circular consensus sequencing (CCS) is reported to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb).
Abstract: The DNA sequencing technologies in use today produce either highly accurate short reads or less-accurate long reads. We report the optimization of circular consensus sequencing (CCS) to improve the accuracy of single-molecule real-time (SMRT) sequencing (PacBio) and generate highly accurate (99.8%) long high-fidelity (HiFi) reads with an average length of 13.5 kilobases (kb). We applied our approach to sequence the well-characterized human HG002/NA24385 genome and obtained precision and recall rates of at least 99.91% for single-nucleotide variants (SNVs), 95.98% for insertions and deletions [...] 15 megabases (Mb) and concordance of 99.997%, substantially outperforming assembly with less-accurate long reads. High-fidelity reads improve variant detection and genome assembly on the PacBio platform.

876 citations
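
Read accuracies such as the 99.8% quoted in the abstract above are often expressed on the Phred scale, Q = -10 log10(1 - accuracy). The short sketch below converts between the two; the function names are hypothetical and the conversion is the standard Phred definition, not something stated in the abstract.

```python
import math


def phred_quality(accuracy):
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred quality score."""
    return -10 * math.log10(1 - accuracy)


def accuracy_from_phred(q):
    """Convert a Phred quality score back to a per-base accuracy."""
    return 1 - 10 ** (-q / 10)


print(round(phred_quality(0.998), 1))  # 99.8% accuracy is roughly Q27
```

On this scale every 10 points corresponds to a tenfold reduction in error rate, so Q30 is one error per 1,000 bases.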


Journal ArticleDOI
TL;DR: In this article, the authors present CRISPResso2 to analyze base editors, perform allele-specific quantification or incorporate biologically-informed and scalable alignment approaches, and demonstrate its functionality by experimentally measuring and analyzing the editing properties of six genome editing agents.
Abstract: Genome editing technologies are rapidly evolving, and analysis of deep sequencing data from target or off-target regions is necessary for measuring editing efficiency and evaluating safety. However, no software exists to analyze base editors, perform allele-specific quantification or that incorporates biologically-informed and scalable alignment approaches. Here, we present CRISPResso2 to fill this gap and illustrate its functionality by experimentally measuring and analyzing the editing properties of six genome editing agents.

696 citations


Journal ArticleDOI
TL;DR: A machine learning model allows the identification of new small-molecule kinase inhibitors in days and is used to discover potent inhibitors of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, in 21 days.
Abstract: We have developed a deep generative model, generative tensorial reinforcement learning (GENTRL), for de novo small-molecule design. GENTRL optimizes synthetic feasibility, novelty, and biological activity. We used GENTRL to discover potent inhibitors of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, in 21 days. Four compounds were active in biochemical assays, and two were validated in cell-based assays. One lead candidate was tested and demonstrated favorable pharmacokinetics in mice.

Journal ArticleDOI
TL;DR: It is found that PHATE consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools, and is applicable to a wide variety of data types.
Abstract: The high-dimensional data created by high-throughput technologies require visualization tools that reveal data structure and patterns in an intuitive form. We present PHATE, a visualization method that captures both local and global nonlinear structure using an information-geometric distance between data points. We compare PHATE to other tools on a variety of artificial and biological datasets, and find that it consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools. We define a manifold preservation metric, which we call denoised embedding manifold preservation (DEMaP), and show that PHATE produces lower-dimensional embeddings that are quantitatively better denoised as compared to existing visualization methods. An analysis of a newly generated single-cell RNA sequencing dataset on human germ-layer differentiation demonstrates how PHATE reveals unique biological insight into the main developmental branches, including identification of three previously undescribed subpopulations. We also show that PHATE is applicable to a wide variety of data types, including mass cytometry, single-cell RNA sequencing, Hi-C and gut microbiome data.

Journal ArticleDOI
TL;DR: Analysis of scATAC-seq profiles from serial tumor biopsies before and after programmed cell death protein 1 blockade identifies chromatin regulators of therapy-responsive T cell subsets and reveals a shared regulatory program that governs intratumoral T cell exhaustion and CD4+ T follicular helper cell development.
Abstract: Understanding complex tissues requires single-cell deconstruction of gene regulation with precision and scale. Here, we assess the performance of a massively parallel droplet-based method for mapping transposase-accessible chromatin in single cells using sequencing (scATAC-seq). We apply scATAC-seq to obtain chromatin profiles of more than 200,000 single cells in human blood and basal cell carcinoma. In blood, application of scATAC-seq enables marker-free identification of cell type-specific cis- and trans-regulatory elements, mapping of disease-associated enhancer activity and reconstruction of trajectories of cellular differentiation. In basal cell carcinoma, application of scATAC-seq reveals regulatory networks in malignant, stromal and immune cells in the tumor microenvironment. Analysis of scATAC-seq profiles from serial tumor biopsies before and after programmed cell death protein 1 blockade identifies chromatin regulators of therapy-responsive T cell subsets and reveals a shared regulatory program that governs intratumoral CD8+ T cell exhaustion and CD4+ T follicular helper cell development. We anticipate that scATAC-seq will enable the unbiased discovery of gene regulatory factors across diverse biological systems.

Journal ArticleDOI
TL;DR: The links between plant genotype and root microbiota membership established in this study will inform breeding strategies to improve nitrogen use in crops and coordinate recruitment of the root microbiota to optimize nitrogen acquisition from soil.
Abstract: Nitrogen-use efficiency of indica varieties of rice is superior to that of japonica varieties. We apply 16S ribosomal RNA gene profiling to characterize root microbiota of 68 indica and 27 japonica varieties grown in the field. We find that indica and japonica recruit distinct root microbiota. Notably, indica-enriched bacterial taxa are more diverse, and contain more genera with nitrogen metabolism functions, than japonica-enriched taxa. Using genetic approaches, we provide evidence that NRT1.1B, a rice nitrate transporter and sensor, is associated with the recruitment of a large proportion of indica-enriched bacteria. Metagenomic sequencing reveals that the ammonification process is less abundant in the root microbiome of the nrt1.1b mutant. We isolated 1,079 pure bacterial isolates from indica and japonica roots and derived synthetic communities (SynComs). Inoculation of IR24, an indica variety, with an indica-enriched SynCom improved rice growth in organic nitrogen conditions compared with a japonica-enriched SynCom. The links between plant genotype and root microbiota membership established in this study will inform breeding strategies to improve nitrogen use in crops.

Journal ArticleDOI
TL;DR: Development of next-generation crops will be enabled by combining state-of-the-art technologies with speed breeding, helping plant breeders keep pace with a changing environment and ever-increasing human population.
Abstract: Crop improvements can help us to meet the challenge of feeding a population of 10 billion, but can we breed better varieties fast enough? Technologies such as genotyping, marker-assisted selection, high-throughput phenotyping, genome editing, genomic selection and de novo domestication could be galvanized by using speed breeding to enable plant breeders to keep pace with a changing environment and ever-increasing human population.

Journal ArticleDOI
TL;DR: Droplet-based single-nucleus chromatin accessibility and mRNA expression sequencing (SNARE-seq), a method that can link a cell's transcriptome with its accessible chromatin for sequencing at scale, is described and used to reconstruct the transcriptome and epigenetic landscapes of major and rare cell types.
Abstract: Single-cell RNA sequencing can reveal the transcriptional state of cells, yet provides little insight into the upstream regulatory landscape associated with open or accessible chromatin regions. Joint profiling of accessible chromatin and RNA within the same cells would permit direct matching of transcriptional regulation to its outputs. Here, we describe droplet-based single-nucleus chromatin accessibility and mRNA expression sequencing (SNARE-seq), a method that can link a cell's transcriptome with its accessible chromatin for sequencing at scale. Specifically, accessible sites are captured by Tn5 transposase in permeabilized nuclei to permit, within many droplets in parallel, DNA barcode tagging together with the mRNA molecules from the same cells. To demonstrate the utility of SNARE-seq, we generated joint profiles of 5,081 and 10,309 cells from neonatal and adult mouse cerebral cortices, respectively. We reconstructed the transcriptome and epigenetic landscapes of major and rare cell types, uncovered lineage-specific accessible sites, especially for low-abundance cells, and connected the dynamics of promoter accessibility with transcription level during neurogenesis.

Journal ArticleDOI
TL;DR: Scanorama is an algorithm that identifies and merges the shared cell types among all pairs of datasets and accurately integrates heterogeneous collections of scRNA-seq data and is orders of magnitude faster than existing techniques.
Abstract: Integration of single-cell RNA sequencing (scRNA-seq) data from multiple experiments, laboratories and technologies can uncover biological insights, but current methods for scRNA-seq data integration are limited by a requirement for datasets to derive from functionally similar cells. We present Scanorama, an algorithm that identifies and merges the shared cell types among all pairs of datasets and accurately integrates heterogeneous collections of scRNA-seq data. We applied Scanorama to integrate and remove batch effects across 105,476 cells from 26 diverse scRNA-seq experiments representing 9 different technologies. Scanorama is sensitive to subtle temporal changes within the same cell lineage, successfully integrating functionally similar cells across time series data of CD14+ monocytes at different stages of differentiation into macrophages. Finally, we show that Scanorama is orders of magnitude faster than existing techniques and can integrate a collection of 1,095,538 cells in just ~9 h. Scanorama integrates single-cell RNA-seq datasets from different tissues, different labs, different experiments or different technologies.

Journal ArticleDOI
TL;DR: This work presents vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions, and applies it to analyze 15,280 Global Ocean Virome genome fragments.
Abstract: Microbiomes from every environment contain a myriad of uncultivated archaeal and bacterial viruses, but studying these viruses is hampered by the lack of a universal, scalable taxonomic framework. We present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.

Journal ArticleDOI
TL;DR: It is shown that both mouse and human iPSCs lose their immunogenicity when major histocompatibility complex (MHC) class I and II genes are inactivated and CD47 is over-expressed, which suggests that hypoimmunogenic cell grafts can be engineered for universal transplantation.
Abstract: Autologous induced pluripotent stem cells (iPSCs) constitute an unlimited cell source for patient-specific cell-based organ repair strategies. However, their generation and subsequent differentiation into specific cells or tissues entail cell line-specific manufacturing challenges and form a lengthy process that precludes acute treatment modalities. These shortcomings could be overcome by using prefabricated allogeneic cell or tissue products, but the vigorous immune response against histo-incompatible cells has prevented the successful implementation of this approach. Here we show that both mouse and human iPSCs lose their immunogenicity when major histocompatibility complex (MHC) class I and II genes are inactivated and CD47 is over-expressed. These hypoimmunogenic iPSCs retain their pluripotent stem cell potential and differentiation capacity. Endothelial cells, smooth muscle cells, and cardiomyocytes derived from hypoimmunogenic mouse or human iPSCs reliably evade immune rejection in fully MHC-mismatched allogeneic recipients and survive long-term without the use of immunosuppression. These findings suggest that hypoimmunogenic cell grafts can be engineered for universal transplantation.

Journal ArticleDOI
TL;DR: In this article, the authors engineer an enhanced Acidaminococcus sp. Cas12a variant (enAsCas12a) with a substantially expanded targeting range that improves the efficiency of multiplex gene editing, endogenous gene activation and C-to-T base editing.
Abstract: Broad use of CRISPR-Cas12a (formerly Cpf1) nucleases has been hindered by the requirement for an extended TTTV protospacer adjacent motif (PAM). To address this limitation, we engineered an enhanced Acidaminococcus sp. Cas12a variant (enAsCas12a) that has a substantially expanded targeting range, enabling targeting of many previously inaccessible PAMs. On average, enAsCas12a exhibits a twofold higher genome editing activity on sites with canonical TTTV PAMs compared to wild-type AsCas12a, and we successfully grafted a subset of mutations from enAsCas12a onto other previously described AsCas12a variants to enhance their activities. enAsCas12a improves the efficiency of multiplex gene editing, endogenous gene activation and C-to-T base editing, and we engineered a high-fidelity version of enAsCas12a (enAsCas12a-HF1) to reduce off-target effects. Both enAsCas12a and enAsCas12a-HF1 function in HEK293T and primary human T cells when delivered as ribonucleoprotein (RNP) complexes. Collectively, enAsCas12a provides an optimized version of Cas12a that should enable wider application of Cas12a enzymes for gene and epigenetic editing.

Journal ArticleDOI
TL;DR: Paddy trials showed that genome-edited SWEET promoters endow rice lines with robust, broad-spectrum resistance to all Xanthomonas bacterial blight strains tested.
Abstract: Bacterial blight of rice is an important disease in Asia and Africa. The pathogen, Xanthomonas oryzae pv. oryzae (Xoo), secretes one or more of six known transcription-activator-like effectors (TALes) that bind specific promoter sequences and induce, at minimum, one of the three host sucrose transporter genes SWEET11, SWEET13 and SWEET14, the expression of which is required for disease susceptibility. We used CRISPR-Cas9-mediated genome editing to introduce mutations in all three SWEET gene promoters. Editing was further informed by sequence analyses of TALe genes in 63 Xoo strains, which revealed multiple TALe variants for SWEET13 alleles. Mutations were also created in SWEET14, which is also targeted by two TALes from an African Xoo lineage. A total of five promoter mutations were simultaneously introduced into the rice line Kitaake and the elite mega varieties IR64 and Ciherang-Sub1. Paddy trials showed that genome-edited SWEET promoters endow rice lines with robust, broad-spectrum resistance.

Journal ArticleDOI
TL;DR: The improved resource of gastrointestinal bacterial reference sequences circumvents dependence on de novo assembly of metagenomes and enables accurate and cost-effective shotgun metagenomic analyses of human gastrointestinal microbiota.
Abstract: Understanding gut microbiome functions requires cultivated bacteria for experimental validation and reference bacterial genome sequences to interpret metagenome datasets and guide functional analyses. We present the Human Gastrointestinal Bacteria Culture Collection (HBC), a comprehensive set of 737 whole-genome-sequenced bacterial isolates, representing 273 species (105 novel species) from 31 families found in the human gastrointestinal microbiota. The HBC increases the number of bacterial genomes derived from human gastrointestinal microbiota by 37%. The resulting global Human Gastrointestinal Bacteria Genome Collection (HGG) classifies 83% of genera by abundance across 13,490 shotgun-sequenced metagenomic samples, improves taxonomic classification by 61% compared to the Human Microbiome Project (HMP) genome collection and achieves subspecies-level classification for almost 50% of sequences. The improved resource of gastrointestinal bacterial reference sequences circumvents dependence on de novo assembly of metagenomes and enables accurate and cost-effective shotgun metagenomic analyses of human gastrointestinal microbiota.

Journal ArticleDOI
TL;DR: This work systematically studies the influence of flanking DNA sequence on repair outcome by measuring the edits generated by >40,000 guide RNAs (gRNAs) in synthetic constructs, uncovers sequence determinants of the mutations produced and uses these to derive a predictor of Cas9 editing outcomes.
Abstract: The DNA mutation produced by cellular repair of a CRISPR-Cas9-generated double-strand break determines its phenotypic effect. It is known that the mutational outcomes are not random, but depend on DNA sequence at the targeted location. Here we systematically study the influence of flanking DNA sequence on repair outcome by measuring the edits generated by >40,000 guide RNAs (gRNAs) in synthetic constructs. We performed the experiments in a range of genetic backgrounds and using alternative CRISPR-Cas9 reagents. In total, we gathered data for >10^9 mutational outcomes. The majority of reproducible mutations are insertions of a single base, short deletions or longer microhomology-mediated deletions. Each gRNA has an individual cell-line-dependent bias toward particular outcomes. We uncover sequence determinants of the mutations produced and use these to derive a predictor of Cas9 editing outcomes. Improved understanding of sequence repair will allow better design of gene editing experiments.

Journal ArticleDOI
TL;DR: Nanopore sequencing coupled with a metagenomics framework that effectively removes human DNA from samples enables rapid bacterial LRI diagnosis and might contribute to a reduction in broad-spectrum antibiotic use.
Abstract: The gold standard for clinical diagnosis of bacterial lower respiratory infections (LRIs) is culture, which has poor sensitivity and is too slow to guide early, targeted antimicrobial therapy. Metagenomic sequencing could identify LRI pathogens much faster than culture, but methods are needed to remove the large amount of human DNA present in these samples for this approach to be feasible. We developed a metagenomics method for bacterial LRI diagnosis that features efficient saponin-based host DNA depletion and nanopore sequencing. Our pilot method was tested on 40 samples, then optimized and tested on a further 41 samples. Our optimized method (6 h from sample to result) was 96.6% sensitive and 41.7% specific for pathogen detection compared with culture and we could accurately detect antibiotic resistance genes. After confirmatory quantitative PCR and pathobiont-specific gene analyses, specificity and sensitivity increased to 100%. Nanopore metagenomics can rapidly and accurately characterize bacterial LRIs and might contribute to a reduction in broad-spectrum antibiotic use.

Journal ArticleDOI
TL;DR: This work presents the Culturable Genome Reference (CGR), a collection of 1,520 nonredundant, high-quality draft genomes generated from >6,000 bacteria cultivated from fecal samples of healthy humans and chosen to cover all major bacterial phyla and genera in the human gut.
Abstract: Reference genomes are essential for metagenomic analyses and functional characterization of the human gut microbiota. We present the Culturable Genome Reference (CGR), a collection of 1,520 nonredundant, high-quality draft genomes generated from >6,000 bacteria cultivated from fecal samples of healthy humans. Of the 1,520 genomes, which were chosen to cover all major bacterial phyla and genera in the human gut, 264 are not represented in existing reference genome catalogs. We show that this increase in the number of reference bacterial genomes improves the rate of mapping metagenomic sequencing reads from 50% to >70%, enabling higher-resolution descriptions of the human gut microbiome. We use the CGR genomes to annotate functions of 338 bacterial species, showing the utility of this resource for functional studies. We also carry out a pan-genome analysis of 38 important human gut species, which reveals the diversity and specificity of functional enrichment between their core and dispensable genomes.

Journal ArticleDOI
TL;DR: The range of biochemical analytes that can be sensed in dermal interstitial fluid, saliva and sweat is surveyed, and criteria for tailoring sensor design to address the right analyte in the right body site for the right disease or wellness application are outlined.
Abstract: Peripheral biochemical monitoring involves the use of wearable devices for minimally invasive or noninvasive measurement of analytes in biofluids such as interstitial fluid, saliva, tears and sweat. The goal in most cases is to obtain measurements that serve as surrogates for circulating analyte concentrations in blood. Key technological developments to date include continuous glucose monitors, which use an indwelling sensor needle to measure glucose in interstitial fluid, and device-integrated sweat stimulation for continuous access to analytes in sweat. Further development of continuous sensing technologies through new electrochemical sensing modalities will be a major focus of future research. While there has been much investment in wearable technologies to sense analytes, less effort has been directed to understanding the physiology of biofluid secretion. Elucidating the underlying biology is crucial for accelerating technological progress, as the biofluid itself often presents the greatest challenge in terms of sample volumes, secretion rates, filtration, active analyte channels, variable pH and salinity, analyte breakdown and other confounding factors.

Journal ArticleDOI
TL;DR: It is shown that Palantir outperforms existing algorithms in identifying cell lineages and recapitulating gene expression trends during differentiation, is generalizable to diverse tissue types, and is well-suited to resolving less-studied differentiating systems.
Abstract: Single-cell RNA sequencing studies of differentiating systems have raised fundamental questions regarding the discrete versus continuous nature of both differentiation and cell fate. Here we present Palantir, an algorithm that models trajectories of differentiating cells by treating cell fate as a probabilistic process and leverages entropy to measure cell plasticity along the trajectory. Palantir generates a high-resolution pseudo-time ordering of cells and, for each cell state, assigns a probability of differentiating into each terminal state. We apply our algorithm to human bone marrow single-cell RNA sequencing data and detect important landmarks of hematopoietic differentiation. Palantir's resolution enables the identification of key transcription factors that drive lineage fate choice and closely track when cells lose plasticity. We show that Palantir outperforms existing algorithms in identifying cell lineages and recapitulating gene expression trends during differentiation, is generalizable to diverse tissue types, and is well-suited to resolving less-studied differentiating systems.

Journal ArticleDOI
TL;DR: The MIUViG (Minimum Information about an Uncultivated Virus Genome) standard was developed within the Genomic Standards Consortium framework and includes virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction.
Abstract: We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards, which complement the Minimum Information about a Single Amplified Genome (MISAG) and Metagenome-Assembled Genome (MIMAG) standards for uncultivated bacteria and archaea, will improve the reporting of uncultivated virus genomes in public databases. In turn, this should enable more robust comparative studies and a systematic exploration of the global virosphere.

Journal ArticleDOI
TL;DR: Biocompatible cubic-phase erbium-based rare-earth nanoparticles (ErNPs) were used for dynamic imaging of cancer immunotherapy in mice, achieving tumor-to-normal tissue signal ratios of ~40 in a mouse model of colon cancer.
Abstract: The near-infrared-IIb (NIR-IIb) (1,500–1,700 nm) window is ideal for deep-tissue optical imaging in mammals, but lacks bright and biocompatible probes. Here, we developed biocompatible cubic-phase (α-phase) erbium-based rare-earth nanoparticles (ErNPs) exhibiting bright downconversion luminescence at ~1,600 nm for dynamic imaging of cancer immunotherapy in mice. We used ErNPs functionalized with cross-linked hydrophilic polymer layers attached to anti-PD-L1 (programmed cell death-1 ligand-1) antibody for molecular imaging of PD-L1 in a mouse model of colon cancer and achieved tumor-to-normal tissue signal ratios of ~40. The long luminescence lifetime of ErNPs (~4.6 ms) enabled simultaneous imaging of ErNPs and lead sulfide quantum dots emitting in the same ~1,600 nm window. In vivo NIR-IIb molecular imaging of PD-L1 and CD8 revealed cytotoxic T lymphocytes in the tumor microenvironment in response to immunotherapy, and altered CD8 signals in tumor and spleen due to immune activation. The cross-linked functionalization layer facilitated 90% ErNP excretion within 2 weeks without detectable toxicity in mice. Biocompatible rare-earth nanoparticles with an emission maximum at 1,600 nm enable sensitive in vivo imaging.

Journal ArticleDOI
TL;DR: A combinatorial library of ionizable lipid-like materials is developed to identify mRNA delivery vehicles that facilitate mRNA delivery in vivo and provide potent and specific immune activation, and result in limited systemic cytokine expression and enhanced anti-tumor efficacy.
Abstract: Therapeutic messenger RNA vaccines enable delivery of whole antigens, which can be advantageous over peptide vaccines. However, optimal efficacy requires both intracellular delivery, to allow antigen translation, and appropriate immune activation. Here, we developed a combinatorial library of ionizable lipid-like materials to identify mRNA delivery vehicles that facilitate mRNA delivery in vivo and provide potent and specific immune activation. Using a three-dimensional multi-component reaction system, we synthesized and evaluated the vaccine potential of over 1,000 lipid formulations. The top candidate formulations induced a robust immune response, and were able to inhibit tumor growth and prolong survival in melanoma and human papillomavirus E7 in vivo tumor models. The top-performing lipids share a common structure: an unsaturated lipid tail, a dihydroimidazole linker and cyclic amine head groups. These formulations induce antigen-presenting cell maturation via the intracellular stimulator of interferon genes (STING) pathway, rather than through Toll-like receptors, and result in limited systemic cytokine expression and enhanced anti-tumor efficacy.