A Single-cell RNA-seq Training and Analysis Suite using the Galaxy Framework
Citations
Revealing the vectors of cellular identity with single-cell genomics
From bench to bedside: Single-cell analysis for cancer immunotherapy
Building Domain-Specific Machine Learning Workflows: A Conceptual Framework for the State-of-the-Practice
References
STAR: ultrafast universal RNA-seq aligner
Visualizing Data using t-SNE
Fast unfolding of communities in large networks
Salmon provides fast and bias-aware quantification of transcript expression
Related Papers (5)
Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape
Galaxy tools and workflows for sequence analysis with applications in molecular plant pathology
Frequently Asked Questions (17)
Q2. What was the primary language for scRNA-seq?
ScanPy was developed as the Python alternative to the innumerable R-based packages for scRNA-seq, R being the dominant language for such analyses, and it was one of the first packages with native 10x Genomics support.
Q3. What are the main stages of scRNA-seq analysis?
The downstream modules are defined by the five main stages of downstream scRNA-seq analysis: filtering, normalisation, confounder removal, clustering, and trajectory inference.
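As an illustration of the first of these stages, filtering can be sketched in plain Python on a small cells-by-genes count matrix. This is a minimal sketch rather than the method used by any particular tool: the function name `filter_matrix`, both thresholds, and the toy matrix are hypothetical and chosen only for the example.

```python
def filter_matrix(counts, min_genes_per_cell=2, min_cells_per_gene=2):
    """Drop cells with too few detected genes, then genes seen in too few kept cells.

    `counts` is a list of rows (cells), each a list of per-gene counts.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    # A gene counts as "detected" in a cell when its count is above zero.
    kept_cells = [
        i for i, row in enumerate(counts)
        if sum(1 for value in row if value > 0) >= min_genes_per_cell
    ]
    n_genes = len(counts[0]) if counts else 0
    kept_genes = [
        j for j in range(n_genes)
        if sum(1 for i in kept_cells if counts[i][j] > 0) >= min_cells_per_gene
    ]
    filtered = [[counts[i][j] for j in kept_genes] for i in kept_cells]
    return filtered, kept_cells, kept_genes


# Hypothetical toy matrix: 4 cells (rows) x 4 genes (columns).
toy_counts = [
    [5, 0, 2, 0],
    [3, 1, 0, 0],
    [0, 0, 0, 1],  # only one gene detected, so this cell is dropped
    [4, 2, 6, 0],
]
filtered, kept_cells, kept_genes = filter_matrix(toy_counts)
print(kept_cells, kept_genes)
```

Cells are filtered before genes here so that a gene detected only in discarded cells is also removed; real workflows may order or iterate these steps differently.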
Q4. What are the main problems of the first wave of software utilities?
The first wave of software utilities to deal with the analysis of single-cell datasets were statistical packages, aimed at tackling the issue of “dropout events” during sequencing, which would manifest as a high prevalence of zero entries in over 80% of the feature-count matrix.
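The sparsity described above can be measured directly. The following is a minimal sketch, assuming the matrix is held as a list of per-cell rows; the function name `zero_fraction` and the toy values are hypothetical.

```python
def zero_fraction(counts):
    """Return the fraction of entries in a cells x genes count matrix that are zero."""
    total = sum(len(row) for row in counts)
    if total == 0:
        raise ValueError("count matrix is empty")
    zeros = sum(1 for row in counts for value in row if value == 0)
    return zeros / total


# Hypothetical toy matrix: 3 cells (rows) x 5 genes (columns).
toy_counts = [
    [0, 0, 3, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 7, 0],
]
sparsity = zero_fraction(toy_counts)
print(f"{sparsity:.0%} of entries are zero")
```

A zero entry is not necessarily a dropout event: the gene may simply not be expressed in that cell, which is why dedicated statistical models were developed rather than relying on a simple count like this one.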
Q5. What is the purpose of the training materials?
The teaching and training materials are part of the Galaxy Training Network (GTN), which is a worldwide collaborative effort to produce high-quality teaching material in order to educate users in how to analyse their data, and in turn to train others on the same materials via easily deployable workshops backed by monthly stable releases of the GTN materials [19].
Q6. What are some of the sources of variability in the cell cycle?
Other sources of variability stem from unwanted biological contributions known as confounder effects, such as cell cycle effects and transcription.
Q7. What is the common pitfall at this stage?
One common pitfall at this very first stage is estimating how many cells to expect from the FASTQ input data, and this requires three crucial pieces of information: which reads contain the barcodes (or precisely, which subset of both the forward and reverse reads contains the barcodes); of these barcodes, which specific ones were actually used for the analysis; and how to resolve barcode mismatches/errors.
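The three pieces of information above map onto three small steps: slicing the barcode out of the read, checking it against the list of barcodes actually used, and rescuing near-misses. The sketch below is illustrative only. The barcode position and length, the whitelist, the one-mismatch tolerance, and all function names are assumptions made for the example, not the behaviour of any specific tool.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def extract_barcode(read, start, length):
    """Slice the barcode out of a read; start and length are protocol-specific."""
    return read[start:start + length]


def correct_barcode(barcode, whitelist, max_mismatches=1):
    """Map an observed barcode to a whitelisted one, or None if absent or ambiguous."""
    if barcode in whitelist:
        return barcode
    candidates = [
        known for known in whitelist
        if len(known) == len(barcode) and hamming(known, barcode) <= max_mismatches
    ]
    # Accept a correction only when exactly one whitelisted barcode is close enough.
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical data: 6-base barcodes at the start of each read.
whitelist = {"AAAAAA", "CCCCCC", "GGGGTT"}
reads = ["AAAAAATTGCA", "CCCCCGTTGCA", "TTTTTTTTGCA", "AAAAAAGGGCA"]

cell_counts = {}
for read in reads:
    barcode = correct_barcode(extract_barcode(read, 0, 6), whitelist)
    if barcode is not None:
        cell_counts[barcode] = cell_counts.get(barcode, 0) + 1

print(cell_counts)
```

The number of distinct keys in `cell_counts` gives a first estimate of how many cells the input contains. Rejecting ambiguous corrections is a conservative design choice; other strategies, such as weighting candidates by read abundance, are also possible.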
Q8. What is the purpose of the normalisation step?
The normalisation step aims to remove any technical factors that are not relevant to the analysis, such as the library size, where cells sharing the same identity are likely to differ from one another more by the number of transcripts they exhibit than due to more relevant biological factors.
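A simple way to see the effect of library size is to rescale every cell to the same total count. The sketch below assumes plain total-count scaling with a hypothetical target of 10,000 counts per cell; it is one of several normalisation strategies and is not claimed to be the one used in the tools discussed here.

```python
def normalise_library_size(counts, target_sum=10_000):
    """Scale each cell (row) so its counts sum to target_sum.

    target_sum is an illustrative assumption; cells with no counts are left as zeros.
    """
    normalised = []
    for row in counts:
        library_size = sum(row)
        if library_size == 0:
            normalised.append([0.0] * len(row))
            continue
        scale = target_sum / library_size
        normalised.append([value * scale for value in row])
    return normalised


# Hypothetical toy data: two cells with the same expression profile
# but a ten-fold difference in sequencing depth.
toy_counts = [
    [10, 30, 60],     # library size 100
    [100, 300, 600],  # library size 1000
]
normalised = normalise_library_size(toy_counts)
print(normalised)
```

Before scaling, the two cells differ ten-fold at every gene; afterwards their profiles are identical, which is the sense in which library size is a technical factor rather than a biological one.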
Q9. What is the role of language translation in the training materials?
The GTN also makes use of language translation tools to provide international interpretations of the training materials in order to reach a wider, more internationally diverse audience.
Q10. What is the future of scRNA sequencing?
The advent of scRNA-seq analysis within the Galaxy framework re-echoes the efforts to standardise the analysis of scRNA-seq with the promise of presenting reproducible research.
Q11. What is the purpose of the tutorials?
The tutorials are designed to broadly appeal to both the biologist and the statistician, as well as complete beginners to the entire topic.
Q12. What was the main reason for the development of standalone analysis suites?
Standalone analysis suites emerged as the different authors of these packages rapidly expanded their methods to encapsulate all facets of the single-cell analysis, often creating compatibility issues with previous package versions.
Q13. What is the main purpose of the Galaxy framework?
The Galaxy framework abstracts the user from the many nontrivial technicalities of the analysis, and exposes them to a legible interface of tools that they can pick and choose from.
Q14. What was the main focus of the analysis of scRNA-seq within Galaxy?
The analysis of scRNA-seq within Galaxy was a two-pronged effort concentrated on bringing high-quality single-cell tools into Galaxy, and providing the necessary workflows and training to accompany them.
Q15. Who has granted bioRxiv a license to display the preprint in perpetuity?
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Q16. What was the main reason for the incompatibility between packages?
This incompatibility between packages fuelled a choice of one analysis suite over another, or conversely required researchers to dig deeper into the internal semantics of R S4 objects in order to manually slot data components together [12].
Q17. What are the prerequisites for a tutorial?
These tutorials can also declare prerequisites, so that users can review required concepts from previous tutorials, e.g. quality control checks from bulk RNA-seq still being used in scRNA-seq.