(Open Access) SCENIC: single-cell regulatory network inference and clustering. (2017) | Sara Aibar

Q1. What have the authors contributed in "SCENIC: single-cell regulatory network inference and clustering"?

Here the authors describe a computational resource, called SCENIC (Single Cell rEgulatory Network Inference and Clustering), for the simultaneous reconstruction of gene regulatory networks (GRNs) and the identification of stable cell states, using single-cell RNA-seq data. Importantly, the authors show that cell state identification based on GRNs is robust to batch effects and technical biases. The authors applied SCENIC to a compendium of single-cell data from the mouse and human brain and demonstrate that the proper combinations of transcription factors, target genes, enhancers, and cell types can be identified. The authors further validated these predictions by showing that two transcription factors are predominantly expressed in early metastatic sentinel lymph nodes. As a scalable alternative to GENIE3, the authors also provide GRNboost, paving the way towards network analysis across millions of single cells.

Q2. What is the reason for the differences between the tumors?

The apparent differences between the tumors at the single-cell level may be due to differences in copy number profiles, which are unique to each tumor and can have a strong impact on the gene expression profile 55,57,75.

Q3. What are the databases used for the analyses presented in this paper?

The databases used for the analyses presented in this paper are the "18k motif collection" from iRegulon (gene-based motif rankings) for human and mouse.

Q4. How many genes were run on the unlogged matrix?

GiniClust 20 was run on the unlogged TPM matrix with the default parameters, which resulted in a matrix with 17843 genes and one single cluster.

Q5. What methods have been developed for the analysis of single-cell RNA-seq data?

Several methods that exploit co-expression or networks for the analysis of single-cell RNA-seq data have tentatively been developed, such as the “network synthesis toolkit” 24, Pina’s approach 25, PAGODA 13, and SINCERA 26.

Q6. What were the non-tumoral cells removed from the expression matrix?

The authors removed these non-tumoral cells from the expression matrix using hierarchical clustering based on the markers cited in the article (mature oligodendrocytes and microglia, respectively).

Q7. How did the authors test the performance of SCENIC?

To test SCENIC's performance, the authors applied it to a scRNA-seq data set with well-known cell types from the adult mouse brain, previously described in Zeisel et al.

Q8. What type of interneuron was identified in the human data set?

SCENIC identified an "interneuron-like" and an "excitatory neuron-like" subpopulation within the fetal quiescent cells in the human data set, expressing DLX1,2,5 and MAF, and NEUROD1.

Q9. How does SCENIC compare to other clustering methods?

In conclusion, SCENIC competes with the best clustering methods in discovering cell types and correctly assigning cells to each cell type. SCENIC goes beyond existing methods, however, by reducing data dimensionality using TF regulons rather than principal components, thereby accounting for noise and removing technical biases, and by uncovering master regulators and gene regulatory networks for each cell type.
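
To make the idea of using regulons as features concrete, here is a minimal sketch of a rank-based regulon activity score in the spirit of an area-under-the-recovery-curve measure. It is a simplified illustration, not SCENIC's implementation: the function names (`regulon_auc`, `activity_matrix`), the `top_fraction` parameter, the gene identifiers, and the regulon names `TF_A` and `TF_B` are all hypothetical.

```python
def regulon_auc(expression, regulon, top_fraction=0.05):
    """Score one regulon in one cell (simplified, illustrative only).

    expression: dict mapping gene -> expression value for a single cell.
    regulon: iterable of target genes.
    Returns a value in [0, 1]; 1 means the regulon genes sit at the very
    top of the cell's expression ranking.
    """
    # Rank genes from highest to lowest expression (ties broken by name).
    ranked = sorted(expression, key=lambda gene: (-expression[gene], gene))
    cutoff = max(1, int(len(ranked) * top_fraction))
    members = set(regulon) & set(ranked)
    if not members:
        return 0.0

    # Accumulate the recovery curve over the top-ranked genes.
    hits = 0
    area = 0
    for gene in ranked[:cutoff]:
        if gene in members:
            hits += 1
        area += hits

    # Largest possible area: all recoverable members ranked first.
    k = min(len(members), cutoff)
    max_area = k * (k + 1) // 2 + k * (cutoff - k)
    return area / max_area


def activity_matrix(cells, regulons, top_fraction=0.05):
    """Build a cell-by-regulon score table from per-cell expression dicts."""
    return {
        cell_id: {
            name: regulon_auc(expr, genes, top_fraction)
            for name, genes in regulons.items()
        }
        for cell_id, expr in cells.items()
    }


# Toy data: ten hypothetical genes, two cells with opposite rankings.
genes = [f"g{i}" for i in range(10)]
cells = {
    "cell_1": {g: 10 - i for i, g in enumerate(genes)},
    "cell_2": {g: i + 1 for i, g in enumerate(genes)},
}
regulons = {"TF_A": ["g0", "g1"], "TF_B": ["g8", "g9"]}
scores = activity_matrix(cells, regulons, top_fraction=0.5)
print(scores)
```

Each cell is then described by one score per regulon instead of one value per gene, and any standard clustering method can be run on that smaller table. Because the score depends only on within-cell ranks, it does not change under monotone rescaling of a cell's expression values.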

Q10. What is the preferred expression value for the first network-inference step?

Note that the first network-inference step is based on co-expression, and some authors recommend avoiding within-sample normalizations (i.e. TPM) for this task because they may induce artificial co-variation 82.
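
The concern about within-sample normalization can be illustrated with a toy simulation. The sketch below assumes three hypothetical genes of equal length (so a TPM-like value reduces to counts divided by the per-cell total, times one million) and made-up distributions; it is not taken from the paper. Two genes are drawn independently, while a third, highly variable gene dominates each cell's total.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n_cells = 300

# Genes A and B are independent; gene C varies widely between cells.
gene_a = [rng.gauss(100, 10) for _ in range(n_cells)]
gene_b = [rng.gauss(100, 10) for _ in range(n_cells)]
gene_c = [rng.uniform(100, 10000) for _ in range(n_cells)]

# Within-sample (per-cell) scaling to a fixed total of one million.
totals = [a + b + c for a, b, c in zip(gene_a, gene_b, gene_c)]
scaled_a = [1e6 * a / t for a, t in zip(gene_a, totals)]
scaled_b = [1e6 * b / t for b, t in zip(gene_b, totals)]

raw_r = pearson(gene_a, gene_b)
scaled_r = pearson(scaled_a, scaled_b)
print(f"raw correlation A vs B:    {raw_r:.3f}")
print(f"scaled correlation A vs B: {scaled_r:.3f}")
```

In this setup the raw values of A and B should be close to uncorrelated, while the rescaled values become strongly positively correlated, because both are divided by the same fluctuating total. A co-expression step run on the rescaled values would report a relationship that the raw data do not contain.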

Q11. What are the limitations to using transcription factor motifs to filter and prune co-expression modules?

There are still limitations to using transcription factor motifs to filter and prune co-expression modules. The most obvious are that motifs are not available for all transcription factors, that some factors have motifs with higher information content than others, and that not all transcription factors are co-expressed with their target genes.

Q12. What is the main cell type in hippocampus and somatosensory cortex?

This data set has been used extensively for benchmarking purposes 13,14,20,27–31 and contains the main cell types in hippocampus and somatosensory cortex, namely neurons (pyramidal excitatory neurons, and interneurons), glia (astrocytes, oligodendrocytes, microglia), and endothelial cells.


