The Impact of Representation on the Optimization of Marker Panels for Single-cell RNA Data

doi:10.1109/CEC45853.2021.9504808

Proceedings Article

Andrea Tangherloni¹, Simone G. Riva², Simone Spolaor³, Daniela Besozzi³, Marco S. Nobile⁴, Paolo Cazzaniga¹

University of Bergamo¹, Wellcome Trust Sanger Institute², University of Milano-Bicocca³, Eindhoven University of Technology⁴

28 Jun 2021, pp. 1423-1430

TL;DR: The paper presents a GA-based approach to identifying succinct marker panels and compares three representations of the candidate solutions. The most relaxed representations prove the most effective, especially in 0-knowledge problems, and the marker panels identified by GAs can outperform manually curated solutions.

Abstract: The increasing number of single-cell transcriptomic and single-cell RNA sequencing studies is allowing for a deeper understanding of the molecular processes underlying the normal development of an organism as well as the onset of pathologies. These studies continuously refine the functional roles of known cell populations, and provide their characterization as soon as putatively novel cell populations are detected. In order to isolate the cell populations for further tailored analysis, succinct marker panels, composed of a few cell surface proteins and clusters of differentiation molecules, must be identified. The identification of these marker panels is a challenging computational problem due to its intrinsic combinatorial nature, which makes it an NP-hard problem. Genetic Algorithms (GAs) have been successfully used in Bioinformatics and other biomedical applications to tackle combinatorial problems. We present here a GA-based approach to solve the problem of the identification of succinct marker panels. Since the performance of a GA is strictly related to the representation of the candidate solutions, we propose and compare three alternative representations, which implicitly introduce different constraints on the search space. For each representation, we fine-tune the parameter settings to calibrate the GA, and we show that different representations yield different performance: the most relaxed representations, in which the GA can also evolve the number of genes in the panel, turn out to be the most effective, especially in the case of 0-knowledge problems. Our results also show that the marker panels identified by GAs can outperform manually curated solutions.
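
To make the idea of a "relaxed" representation concrete, the following is a minimal Python sketch of a GA in which each candidate panel is a variable-size set of gene indices, so the number of genes in the panel can evolve. It is not the authors' implementation: the abstract does not specify the three representations, the fitness function, or the GA operators. The toy data generator, the nearest-centroid fitness with a per-gene size penalty, the tournament selection, and every name below (`make_toy_data`, `fitness`, `mutate`, `crossover`, `run_ga`) are assumptions introduced purely for illustration.

```python
import random


def make_toy_data(rng, n_cells=60, n_genes=12, informative=(0, 1)):
    """Synthetic expression matrix (assumption): two cell populations that
    differ only on the 'informative' genes; all other genes are noise."""
    cells, labels = [], []
    for i in range(n_cells):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 4.0 * label
        cells.append(row)
        labels.append(label)
    return cells, labels


def fitness(panel, cells, labels, penalty=0.02):
    """Illustrative fitness (assumption): nearest-centroid accuracy using
    only the panel genes, minus a small penalty per gene in the panel."""
    if not panel:
        return 0.0
    genes = sorted(panel)
    centroids = {}
    for lab in set(labels):
        rows = [c for c, l in zip(cells, labels) if l == lab]
        centroids[lab] = [sum(r[g] for r in rows) / len(rows) for g in genes]

    def dist(cell, lab):
        return sum((cell[g] - m) ** 2 for g, m in zip(genes, centroids[lab]))

    correct = sum(
        min(centroids, key=lambda lab: dist(c, lab)) == l
        for c, l in zip(cells, labels)
    )
    return correct / len(cells) - penalty * len(genes)


def mutate(panel, n_genes, rng):
    """Add, drop, or swap one gene, so the panel size itself can change."""
    panel = set(panel)
    move = rng.choice(("add", "drop", "swap"))
    if move in ("drop", "swap") and len(panel) > 1:
        panel.remove(rng.choice(sorted(panel)))
    if move in ("add", "swap"):
        panel.add(rng.randrange(n_genes))
    return frozenset(panel)


def crossover(a, b, rng):
    """Child keeps each gene from the union of both parents with prob. 0.5;
    falls back to the first parent if the child would be empty."""
    child = {g for g in sorted(a | b) if rng.random() < 0.5}
    return frozenset(child) if child else a


def run_ga(cells, labels, pop_size=30, generations=40, seed=0):
    """Generational GA with size-3 tournaments and one elite per generation."""
    rng = random.Random(seed)
    n_genes = len(cells[0])

    def score(panel):
        return fitness(panel, cells, labels)

    pop = [
        frozenset(rng.sample(range(n_genes), rng.randint(1, 4)))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = [max(pop, key=score)]  # elitism: carry over the best panel
        while len(children) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            children.append(mutate(crossover(p1, p2, rng), n_genes, rng))
        pop = children
    return max(pop, key=score)


cells, labels = make_toy_data(random.Random(1))
best = run_ga(cells, labels)
print(sorted(best), round(fitness(best, cells, labels), 3))
```

A more constrained representation could, for example, fix the panel size in advance and allow only the swap move. In this sketch the add and drop moves are what let the search also explore how many genes the panel should contain.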
