Q2. What is the reason for the differences between the tumors?
The apparent differences between the tumors at the single-cell level may be due to differences in copy number profiles, which are unique to each tumor and can have a strong impact on the gene expression profile 55,57,75.
Q3. What are the databases used for the analyses presented in this paper?
The analyses presented in this paper use the "18k motif collection" from iRegulon (gene-based motif rankings) for human and mouse.
Q4. How many genes were run on the unlogged matrix?
GiniClust 20 was run on the unlogged TPM matrix with default parameters, which resulted in a matrix with 17843 genes and a single cluster.
Q5. What methods have been developed for the analysis of single-cell RNA-seq data?
Methods that exploit co-expression or networks for the analysis of single-cell RNA-seq data have tentatively been developed, such as the “network synthesis toolkit” 24, Pina’s approach 25, PAGODA 13, and SINCERA 26.
Q6. What were the non-tumoral cells removed from the expression matrix?
The authors removed the non-tumoral cells (mature oligodendrocytes and microglia) from the expression matrix using hierarchical clustering based on the markers cited in the article.
Q7. How did the authors test the performance of SCENIC?
To test SCENIC's performance, the authors applied it to a scRNA-seq data set of the adult mouse brain with well-known cell types, previously described in Zeisel et al.
Q8. What type of interneuron was identified in the human data set?
SCENIC identified an "interneuron-like" and an "excitatory neuron-like" subpopulation within the fetal quiescent cells in the human data set, expressing DLX1,2,5 and MAF, and NEUROD1.
Q9. How does SCENIC compare to other clustering methods?
In conclusion, SCENIC competes with the best clustering methods in discovering cell types and correctly assigning cells to each cell type, but it goes beyond existing methods by reducing data dimensionality using TF regulons rather than principal components, thereby accounting for noise and removing technical biases, and by uncovering master regulators and gene regulatory networks for each cell type.
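To make the idea of "regulons instead of principal components" concrete, here is a minimal sketch that turns a cells-by-genes matrix into a cells-by-regulons matrix. It is not the scoring procedure used by SCENIC: the mean normalized rank used here is a simplified stand-in, and the function name, gene names, regulon names, and values are all hypothetical.

```python
# Simplified sketch: one activity score per (cell, regulon) from within-cell gene ranks.
# Assumption: this is an illustrative stand-in, not SCENIC's actual scoring method.
# All names and values below are hypothetical toy data.

def regulon_activity(expression, regulons):
    """Return {cell: {regulon: score}} with scores in [0, 1].

    expression: {cell: {gene: expression value}}
    regulons:   {regulon name: set of target genes}
    A score near 1 means the regulon's targets are among the
    highest-expressed genes in that cell.
    """
    activity = {}
    for cell, profile in expression.items():
        # Rank genes within the cell, lowest expression first (ties broken by name).
        ordered = sorted(profile, key=lambda g: (profile[g], g))
        denom = max(len(ordered) - 1, 1)
        norm_rank = {g: i / denom for i, g in enumerate(ordered)}
        scores = {}
        for name, targets in regulons.items():
            present = [g for g in targets if g in norm_rank]
            if present:
                scores[name] = sum(norm_rank[g] for g in present) / len(present)
            else:
                scores[name] = 0.0
        activity[cell] = scores
    return activity


expression = {
    "cell_1": {"g1": 9.0, "g2": 8.0, "g3": 0.5, "g4": 5.0, "g5": 4.0},
    "cell_2": {"g1": 0.2, "g2": 0.1, "g3": 7.0, "g4": 5.0, "g5": 4.0},
}
regulons = {"TF_A": {"g1", "g2"}, "TF_B": {"g3"}}

activity = regulon_activity(expression, regulons)
for cell, scores in activity.items():
    print(cell, scores)
```

Because the score depends only on the ranking of genes within each cell, it is unaffected by per-cell scaling, which is one reason rank-based scores are attractive for noisy single-cell data. The two toy cells separate cleanly on two regulon scores instead of five gene values.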
Q10. What is the preferred expression value for the first network-inference step?
Note that the first network-inference step is based on co-expression, and some authors recommend avoiding within-sample normalizations (i.e. TPM) for this task because they may induce artificial co-variation 82.
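The concern about artificial co-variation can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not an analysis from the paper: two genes are simulated independently, a third gene varies strongly between cells, and each cell is then scaled to a fixed total (a simplified stand-in for TPM that ignores gene length). The function names, distributions, and parameters are hypothetical.

```python
import random

# Toy simulation (hypothetical parameters): within-sample scaling to a fixed total
# can make two independently simulated genes appear co-expressed.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_cells=2000, seed=0):
    rng = random.Random(seed)
    # gene_a and gene_b are drawn independently of each other.
    gene_a = [rng.gauss(100, 5) for _ in range(n_cells)]
    gene_b = [rng.gauss(100, 5) for _ in range(n_cells)]
    # gene_c varies strongly between cells and dominates the per-cell total.
    gene_c = [rng.uniform(100, 2000) for _ in range(n_cells)]
    totals = [a + b + c for a, b, c in zip(gene_a, gene_b, gene_c)]
    # Scale every cell to one million total (TPM-like; gene length is ignored here).
    norm_a = [1e6 * a / t for a, t in zip(gene_a, totals)]
    norm_b = [1e6 * b / t for b, t in zip(gene_b, totals)]
    return pearson(gene_a, gene_b), pearson(norm_a, norm_b)


raw_r, norm_r = simulate()
print(f"correlation before scaling: {raw_r:.3f}")
print(f"correlation after scaling:  {norm_r:.3f}")
```

Before scaling, the two genes are essentially uncorrelated. After scaling, both are divided by the same per-cell total, so they rise and fall together even though nothing links them biologically.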
Q11. What are the limitations to using transcription factor motifs to filter and prune co-expression modules?
There are still limitations to using transcription factor motifs to filter and prune co-expression modules, the most obvious being that motifs are not available for all transcription factors, that some factors have motifs with higher information content than others, and that not all transcription factors are co-expressed with their target genes.
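The first limitation is easy to see in a toy version of the pruning step. The sketch below treats pruning as a plain set intersection between each co-expression module and the genes carrying the factor's motif. That is a deliberate simplification, not the motif enrichment procedure used in the paper, and every name and gene set is hypothetical.

```python
# Toy sketch (hypothetical data): prune co-expression modules with motif support.
# Assumption: motif evidence is given as a ready-made set of genes per TF, which
# simplifies away the motif ranking and enrichment analysis.

def prune_modules(coexpression_modules, motif_targets):
    """Keep only module genes that also have motif support for the same TF.

    Returns (regulons, skipped), where skipped lists TFs that have no motif
    at all and therefore cannot be evaluated.
    """
    regulons = {}
    skipped = []
    for tf, module_genes in coexpression_modules.items():
        if tf not in motif_targets:
            # No motif is available for this TF, so its module is lost.
            skipped.append(tf)
            continue
        supported = module_genes & motif_targets[tf]
        if supported:
            regulons[tf] = supported
    return regulons, sorted(skipped)


coexpression_modules = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g4", "g5"},   # no motif known for TF_B
    "TF_C": {"g6"},         # motif known, but no co-expressed gene carries it
}
motif_targets = {
    "TF_A": {"g1", "g3", "g9"},
    "TF_C": {"g7"},
}

regulons, skipped = prune_modules(coexpression_modules, motif_targets)
print("regulons:", regulons)
print("TFs without motifs:", skipped)
```

In this toy run only TF_A survives as a regulon. TF_B is dropped purely because no motif exists for it, regardless of how good its co-expression module is.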
Q12. What is the main cell type in hippocampus and somatosensory cortex?
This data set has been used extensively for benchmarking purposes 13,14,20,27–31 and contains the main cell types in hippocampus and somatosensory cortex, namely neurons (pyramidal excitatory neurons, and interneurons), glia (astrocytes, oligodendrocytes, microglia), and endothelial cells.