A deep-learning tool for species-agnostic integration of cancer cell states
Genetically engineered mouse models (GEMMs) of cancer are useful tools for exploring the development and biological composition of human tumors and, when combined with single-cell RNA sequencing (scRNA-seq), provide a transcriptomic snapshot for exploring the heterogeneity of cancer cell states in an immunocompetent context. However, cross-species comparisons often suffer from biological batch effects, and inherent differences between mice and humans weaken the biological signal that can be gleaned from these models. Here, we develop scVital, a computational tool that uses a variational autoencoder and a discriminator to embed scRNA-seq data into a species-agnostic latent space, overcoming batch effects and identifying cell states shared between species. We introduce the latent space similarity (LSS) score, a new metric designed to evaluate batch-correction accuracy by scoring against pre-labeled clusters instead of creating new clusters, as current methods do. Using this