Author

Xueyi Dong

Bio: Xueyi Dong is an academic researcher at the Walter and Eliza Hall Institute of Medical Research. The author has contributed to research in topics: Bioconductor & Biology. The author has an h-index of 11 and has co-authored 17 publications that have received 1083 citations. Previous affiliations of Xueyi Dong include Zhejiang University & University of Melbourne.

Papers
Journal ArticleDOI
TL;DR: This review surveys the current landscape of long-read analysis tools, focuses on the principles of error correction, base modification detection, and long-read transcriptomics analysis, and highlights the challenges that remain.
Abstract: Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.

1,172 citations

Journal ArticleDOI
TL;DR: This workflow article analyzes RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing.
Abstract: The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene-level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis with the results obtained informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. This pipeline is further enhanced by the Glimma package which enables interactive exploration of the results so that individual samples and genes can be examined by the user. The complete analysis offered by these three packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights using Bioconductor.

386 citations

Journal ArticleDOI
TL;DR: This work generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create ‘pseudo cells’ from up to five distinct cancer cell lines and provided a comprehensive framework for benchmarking most common scRNA-seq analysis steps.
Abstract: Single cell RNA-sequencing (scRNA-seq) technology has undergone rapid development in recent years, leading to an explosion in the number of tailored data analysis methods. However, the current lack of gold-standard benchmark datasets makes it difficult for researchers to systematically compare the performance of the many methods available. Here, we generated a realistic benchmark experiment that included single cells and admixtures of cells or RNA to create 'pseudo cells' from up to five distinct cancer cell lines. In total, 14 datasets were generated using both droplet and plate-based scRNA-seq protocols. We compared 3,913 combinations of data analysis methods for tasks ranging from normalization and imputation to clustering, trajectory analysis and data integration. Evaluation revealed pipelines suited to different types of data for different tasks. Our data and analysis provide a comprehensive framework for benchmarking most common scRNA-seq analysis steps.

237 citations

Journal ArticleDOI
TL;DR: scPipe is an R/Bioconductor package that integrates barcode demultiplexing, read alignment, UMI-aware gene-level quantification and quality control of raw sequencing data generated by multiple protocols.
Abstract: Single-cell RNA sequencing (scRNA-seq) technology allows researchers to profile the transcriptomes of thousands of cells simultaneously. Protocols that incorporate both designed and random barcodes have greatly increased the throughput of scRNA-seq, but give rise to a more complex data structure. There is a need for new tools that can handle the various barcoding strategies used by different protocols and exploit this information for quality assessment at the sample-level and provide effective visualization of these results in preparation for higher-level analyses. To this end, we developed scPipe, an R/Bioconductor package that integrates barcode demultiplexing, read alignment, UMI-aware gene-level quantification and quality control of raw sequencing data generated by multiple protocols that include CEL-seq, MARS-seq, Chromium 10X, Drop-seq and Smart-seq. scPipe produces a count matrix that is essential for downstream analysis along with an HTML report that summarises data quality. These results can be used as input for downstream analyses including normalization, visualization and statistical testing. scPipe performs this processing in a few simple R commands, promoting reproducible analysis of single-cell data that is compatible with the emerging suite of open-source scRNA-seq analysis tools available in R/Bioconductor and beyond. The scPipe R package is available for download from https://www.bioconductor.org/packages/scPipe.

85 citations

Journal ArticleDOI
TL;DR: It is demonstrated that Keap1-deficient KrasG12D lung tumors arise from a bronchiolar cell-of-origin and lack the pro-tumorigenic macrophages observed in tumors originating from alveolar cells, and that Keap1 loss activates the pentose phosphate pathway, inhibition of which, using 6-AN, abrogated tumor growth.
Abstract: The KRAS oncoprotein, a critical driver in 33% of lung adenocarcinoma (LUAD), has remained an elusive clinical target due to its perceived undruggable nature. The identification of dependencies borne through common co-occurring mutations is sought to more effectively target KRAS-mutant lung cancer. Approximately 20% of KRAS-mutant LUAD carry loss-of-function mutations in KEAP1, a negative regulator of the antioxidant response transcription factor NFE2L2/NRF2. We demonstrate that Keap1-deficient KrasG12D lung tumors arise from a bronchiolar cell-of-origin, lacking pro-tumorigenic macrophages observed in tumors originating from alveolar cells. Keap1 loss activates the pentose phosphate pathway, inhibition of which, using 6-AN, abrogated tumor growth. These studies highlight alternative therapeutic approaches to specifically target this unique subset of KRAS-mutant LUAD cancers.

58 citations


Cited by
Journal ArticleDOI
31 Oct 2018-Nature
TL;DR: This study defines 133 transcriptomic cell types, matches transcriptomic types of glutamatergic neurons to their long-range projection specificity, and establishes a combined transcriptomic and projectional taxonomy of cortical cell types from functionally distinct areas of the adult mouse cortex.
Abstract: The neocortex contains a multitude of cell types that are segregated into layers and functionally distinct areas. To investigate the diversity of cell types across the mouse neocortex, here we analysed 23,822 cells from two areas at distant poles of the mouse neocortex: the primary visual cortex and the anterior lateral motor cortex. We define 133 transcriptomic cell types by deep, single-cell RNA sequencing. Nearly all types of GABA (γ-aminobutyric acid)-containing neurons are shared across both areas, whereas most types of glutamatergic neurons were found in one of the two areas. By combining single-cell RNA sequencing and retrograde labelling, we match transcriptomic types of glutamatergic neurons to their long-range projection specificity. Our study establishes a combined transcriptomic and projectional taxonomy of cortical cell types from functionally distinct areas of the adult mouse cortex.

1,184 citations

Journal ArticleDOI
TL;DR: The authors comprehensively benchmark the accuracy, scalability, stability and usability of 45 single-cell trajectory inference methods and develop a set of guidelines to help users select the best method for their dataset.
Abstract: Trajectory inference approaches analyze genome-wide omics data from thousands of single cells and computationally infer the order of these cells along developmental trajectories. Although more than 70 trajectory inference tools have already been developed, it is challenging to compare their performance because the input they require and output models they produce vary substantially. Here, we benchmark 45 of these methods on 110 real and 229 synthetic datasets for cellular ordering, topology, scalability and usability. Our results highlight the complementarity of existing tools, and that the choice of method should depend mostly on the dataset dimensions and trajectory topology. Based on these results, we develop a set of guidelines to help users select the best method for their dataset. Our freely available data and evaluation pipeline (https://benchmark.dynverse.org) will aid in the development of improved tools designed to analyze increasingly large and complex single-cell datasets.

928 citations

Journal ArticleDOI
TL;DR: This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years in single-cell data science.
Abstract: The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.

677 citations

Journal ArticleDOI
05 Feb 2021-Science
TL;DR: Treatment with FMT was associated with favorable changes in immune cell infiltrates and gene expression profiles in both the gut lamina propria and the tumor microenvironment, which have implications for modulating the gut microbiota in cancer treatment.
Abstract: The gut microbiome has been shown to influence the response of tumors to anti-PD-1 (programmed cell death-1) immunotherapy in preclinical mouse models and observational patient cohorts. However, modulation of gut microbiota in cancer patients has not been investigated in clinical trials. In this study, we performed a phase 1 clinical trial to assess the safety and feasibility of fecal microbiota transplantation (FMT) and reinduction of anti-PD-1 immunotherapy in 10 patients with anti-PD-1-refractory metastatic melanoma. We observed clinical responses in three patients, including two partial responses and one complete response. Notably, treatment with FMT was associated with favorable changes in immune cell infiltrates and gene expression profiles in both the gut lamina propria and the tumor microenvironment. These early findings have implications for modulating the gut microbiota in cancer treatment.

609 citations