Exploring dendritic cell diversity in human spleen
Background
The ProjecTILs reference map of human dendritic cells was generated by semi-supervised STACAS integration of 11 scRNA-seq datasets comprising tumor samples from 5 cancer types and blood samples obtained from Gerhard et al. JEM, 2020 and Villani et al. Science, 2017, for tumor and blood data, respectively. The code to generate the reference map is available here.
For DC subtype nomenclature we followed definitions by Gerhard et al. JEM, 2020. Additionally, we annotated in our reference the AS-DC subtype characterized by the expression of AXL, SIGLEC6, PPP1R14A, among others. This population was defined as AS (AXL SIGLEC6) or DC5 by Villani et al. Science, 2017 and as Pre-DC by others.
In this case study, we will use the ProjecTILs DC reference map to classify an external, validation dataset of human spleen-derived DCs from the study of Brown et al. (2019) Cell. This single-cell RNA-seq data generated by Brown et al. consist of Lin-CD14-CD11C+MHCII+ sorted cells (DCs) from fresh spleen tissue obtained from patients undergoing surgical resection of tumors in other organs. Single-cell RNA-seq libraries were prepared with 10X Genomics 3’ Kit (v2).
Goal
To evaluate consistency of ProjecTILs DC subtype classification on an external scRNA-seq dataset of human spleen samples.
R Environment
library(renv)
::restore()
renv
library(patchwork)
library(ggplot2)
library(reshape2)
library(Seurat)
library(ProjecTILs)
DC reference map
The ProjecTILs reference map for dendritic cells can be downloaded directly from figshare:
options(timeout = 120)
<- "DC_human_ref_v1.rds"
ref.file if (!file.exists(ref.file)) {
<- "https://figshare.com/ndownloader/files/39120752"
dataUrl download.file(dataUrl, ref.file)
}<- load.reference.map(ref.file) ref_DC
[1] "Loading Custom Reference Atlas..."
[1] "Loaded Custom Reference map DC_human"
<- ref_DC@misc$atlas.palette
palette DimPlot(ref_DC, cols = palette)
Check marker genes expression in reference subtypes
DefaultAssay(ref_DC) <- "RNA"
Idents(ref_DC) <- "functional.cluster"
<- c("CLEC9A", "XCR1", "CADM1", "CLEC10A", "FCGR2A", "FCER1A", "CD1C",
markerGenes "CD1A", "CD207", "LAMP3", "CCR7", "FSCN1", "GZMB", "LILRA4", "TCF4", "PPP1R14A",
"AXL", "S100A8", "S100A9", "VCAN", "FCN1", "ITGAX", "HLA-DRA", "HLA-DQA1", "HLA-DQB1")
VlnPlot(ref_DC, features = markerGenes, stack = TRUE, cols = palette, flip = TRUE,
fill.by = "ident")
Query data preparation
<- "input/Brown2019_humanDC_data"
ddir
if (!file.exists(ddir)) {
library(dplyr)
dir.create(ddir)
<- "https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM4085nnn/GSM4085512/suppl/GSM4085512_human_spleen_raw_counts.tsv.gz"
dataUrl download.file(dataUrl, paste0(ddir, "/tmp.tsv.gz"))
<- "https://ftp.ncbi.nlm.nih.gov/geo/series/GSE137nnn/GSE137710/suppl/GSE137710_human_spleen_cell_metadata_4465x9.tsv.gz"
metadataUrl download.file(metadataUrl, paste0(ddir, "/meta.tsv.gz"))
# read count matrix
<- sprintf("%s/tmp.tsv.gz", ddir)
counts.file <- fread(file = counts.file) %>%
counts as.data.frame()
# move cell ids
row.names(counts) <- counts$V1
<- counts[, -1]
counts
# get a genes x cells matrix
<- t(counts)
counts colnames(counts) <- as.character(colnames(counts))
# Load Metadata
<- as.data.frame(fread(sprintf("%s/meta.tsv.gz", ddir)))
meta rownames(meta) <- as.character(meta$cell_ID)
# Keep cells with metadata
<- counts[, rownames(meta)]
counts all.equal(colnames(counts), rownames(meta))
# Create and save Seurat object
<- CreateSeuratObject(counts, project = "Brown2019", meta.data = meta)
brown.seurat
saveRDS(brown.seurat, paste0(ddir, "/Brown2019.rds"))
else {
} <- readRDS(paste0(ddir, "/Brown2019.rds"))
brown.seurat }
Exploration of query data
Let’s perform some standard dimensionality reductoin and visualize original DC annotations from Brown et al.
<- NormalizeData(brown.seurat) |>
brown.seurat ScaleData() |>
FindVariableFeatures() |>
RunPCA() |>
RunUMAP(dims = 1:30)
DimPlot(brown.seurat, group.by = "cell_type")
ProjecTILs analysis
Classifier-only mode
First let’s run ProjecTILs using the reference DC map in classifier mode. The classifier with only assign cell type labels to the query cells, without attempting to embed them into the space of the reference (see also this demo)
Since these are pre-sorted DCs, we will disable ProjecTILs
‘pre-filter’, setting filter.cells = F
<- ProjecTILs.classifier(brown.seurat, ref = ref_DC, filter.cells = F) brown.seurat
|
| | 0%[1] "Using assay RNA for query"
[1] "Aligning query to reference map for batch-correction..."
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|
|======================================================================| 100%
Now let’s compare the annotations by the authors with ProjecTIL’s annotations
<- DimPlot(brown.seurat, group.by = "cell_type") + ggtitle("Author original annotation")
a <- DimPlot(brown.seurat, group.by = "functional.cluster", cols = palette) + ggtitle("ProjecTILs classification")
b wrap_plots(a, b) & theme(aspect.ratio = 1)
We can also verify the predicted DC subtype composition in the query dataset:
plot.statepred.composition(ref = ref_DC, query = brown.seurat, labels.col = "functional.cluster")
Here we observe that spleen is largely enriched in cDC2_CLEC10A, cDC1 and AS-DC subtypes.
We can visually inspect marker gene profiles in query data vs. reference map, for each DC subtype:
<- c("CLEC9A", "XCR1", "CADM1", "CLEC10A", "FCGR2A", "FCER1A", "CD1C",
genes4radar "CD1A", "CD207", "LAMP3", "CCR7", "FSCN1", "GZMB", "LILRA4", "TCF4", "PPP1R14A",
"AXL", "S100A8", "S100A9", "VCAN", "FCN1", "ITGAX", "HLA-DRA", "HLA-DQA1", "HLA-DQB1")
<- genes4radar[genes4radar %in% rownames(brown.seurat)]
genes4radar_use
plot.states.radar(ref_DC, query = brown.seurat, genes4radar = genes4radar_use, min.cells = 20)
Interestingly, although DC3 are rare in the spleen (very low frequency in the composition plot), they display a distinctive gene expression profile with LAMP3, CCR7 and FSCN1. Cells predicted as MonoDCs (not described in the original work) display a mixed profile with DC2 (CLEC10A,FCGR2A) and Monocyte (S100A8,S100A9,VCAN,FCN1) marker genes.
Reference space projection mode
ProjecTILs also allows to embed the query data into the reference map UMAP space. To this end, we can run ProjecTILs in standard projection mode:
<- Run.ProjecTILs(brown.seurat, ref = ref_DC, filter.cells = F) brown.projected
|
| | 0%[1] "Using assay RNA for query"
[1] "Aligning query to reference map for batch-correction..."
Projecting corrected query onto Reference PCA space
Projecting corrected query onto Reference UMAP space
|
|======================================================================| 100%
<- unique(brown.projected$cell_type)
original.annotation <- list()
pll
for (c in original.annotation) {
<- subset(brown.projected, subset = cell_type == c)
sub <- plot.projection(ref_DC, query = sub, linesize = 0.5, pointsize = 0.5) +
pll[[c]] ggtitle(c)
}
wrap_plots(pll, ncol = 4)
We can verify a large agreement of original authors annotations and the area of the map where cells were projected. Interestingly, “Mitotic cDC1” and “Mitotic cDC2” were projected within the cDC1 and cDC2 clusters, respectively, irrespective of their proliferation activity. Instead, cells originally annotated as “CCR7+ cDC2” correspond, according to ProjecTILs, to a mix of DC3 (as shown above) and DC2s.
Overall, this analysis supports a consistent classification of spleen DC subtypes by ProjecTILs.
Further reading
Dataset original publication - Brown et al. (2019) Cell
ProjecTILs case studies - INDEX - Repository
The ProjecTILs method Andreatta et. al (2021) Nat. Comm. and code