Detecting CD8 TPEX with ProjecTILs

In this tutorial we highlight ProjecTILs ability to detect CD8+ “TPEX” (Progenitor of exhausted), a rare population of memory CD8+ T cells that sustain CD8 T cell immunity in the context of chronic viral infection and cancer. These CD8 T cells are typically defined by co-expression of TCF1, PD-1 and TOX. They are of primary importance in cancer therapy, as they are thought to renew the pool of terminally exhausted CD8 T cells in response to PD-1/PD-L1 blockade Siddiqui et al.

Tumor T cell differentiation model, going through an intermediate state of progenitor exhausted (TPEX). Figure from Andreatta et al. 2021.

ProjecTILs classifies cells by projecting them onto a reference map. Here we will use our reference map of human tumor-infiltrating CD8 T cells. You can find more information about this map here.

Why using ProjecTILs to classify cell subtypes?

Obtain consistent cell annotations across datasets
Classify cells into reference subtypes, irrespective of activation of transient gene programs, such as cell cycle.
Avoid use of subjective, dataset-specific parameters, such as the selection of highly variable genes, number of clusters, etc.
Projection is robust to batch effects, single-cell technologies, sequencing depth (for more information, please read ProjecTILs paper - Andreatta et al. 2021).

What happens if the input data contains cells that do not match the the reference map cell type?

If the input data contain cell types not included in the reference map(eg. CD4 T cells if the reference is for CD8 T cells) they automatically get filtered-out.

By default, both Run.ProjecTILs() and ProjecTILs.classifier() have the parameter filter.cells set as TRUE. This means that cells out of reference will be filtered-out using the built-in scGate model. This model is stored in the slot misc of the reference Seurat object: ref@misc$scGate. You can custom this filtering by amending this slot using scGate grammar.

Human CD8 TIL reference

First, let’s have a look at the reference map.

# Load the reference
options(timeout = max(900, getOption("timeout")))
#download.file("https://figshare.com/ndownloader/files/38921366", destfile = "CD8T_human_ref_v1.rds")
ref.cd8 <- load.reference.map("CD8T_human_ref_v1.rds")

## [1] "Loading Custom Reference Atlas..."
## [1] "Loaded Custom Reference map Human CD8 TILs"

# Setup colors
mycols <- ref.cd8@misc$atlas.palette

# DimPlot
DimPlot(ref.cd8,  group.by = 'functional.cluster', label = T, repel = T, cols = mycols) + theme(aspect.ratio = 1)

Here are the different T cell subsets defined in the map:

CD8.NaiveLike: Antigen-naive T cells
CD8.CM: Central Memory T cells
CD8.EM: Effector Memory
CD8.TEMRA: Effector Memory cells re-expressing CD45RA. Sometimes called Short Lived Effectors (SLEC), or Cytotoxic effectors
CD8.TPEX: Progenitors exhausted T cells
CD8.TEX: Exhausted T cells
CD8.MAIT: Mucosal-associated invariant T cells, innate-like T cells defined by their semi-invariant αβ T cell receptor

Let’s check Differentially Expressed Genes (DEGs) between subtypes and verify expected marker genes:

# Compute DEGs
DefaultAssay(ref.cd8) <- "RNA"
ref.cd8 <- NormalizeData(ref.cd8)
markers <- FindAllMarkers(object = ref.cd8, only.pos = TRUE, assay = "RNA")

# Remove TCR genes
tcr.genes <- SignatuR::GetSignature(SignatuR$Hs$Compartments$TCR)
markers <- markers %>% filter(!gene %in% unname(tcr.genes))
markers %>% group_by(cluster) %>% top_n(n = 3, wt = avg_log2FC) -> top3

# Plot heatmap
VlnPlot(ref.cd8, assay = "RNA", features  = top3$gene, cols = mycols, stack = T, flip = T, fill.by = "ident") + NoLegend()

Here are some marker genes to help manually identify TPEX:

Positive markers: TCF7, CD200, CRTAM, GNG4, TOX, LEF1, CCR7, CXCL13, XCL1, XCL2

Negative markers: GZMB, NKG7, PRF1, HAVCR2, CCL5, GZMA

Let’s display 6 positive and 6 negative TPEX markers, that are especially useful to distinguish TPEX from closely related TEX.

DefaultAssay(ref.cd8) <- 'RNA'
FeaturePlot(ref.cd8, features = c('TCF7','XCL1','XCL2',"CXCL13","TOX","CRTAM","GZMB","NKG7","CCL5","HAVCR2","PRF1","GZMA"), ncol = 3, pt.size = 0.2, order = T, cols = pals::coolwarm()) & NoLegend()

Despite their great releveance, TPEX are often missed in tumor scRNA-seq studies.

Detecting TPEX in Gueguen et al. 2021

Setup data

#download.file("https://figshare.com/ndownloader/files/39082049", destfile = "gueguen.cd3.Rds")
gueguen.cd3 <- readRDS("gueguen.cd3.Rds")
gueguen.cd3$seurat_clusters <- Idents(gueguen.cd3)

Projection

Thanks to automatic scGate filtering, only the CD8 clusters (upper part of the UMAP) are annotated.

# Projection
DefaultAssay(gueguen.cd3) <- "RNA"
gueguen.cd3 <- ProjecTILs.classifier(gueguen.cd3, ref = ref.cd8, filter.cells = T, split.by = 'orig.ident', ncores = 6)
table(gueguen.cd3$functional.cluster)

## 
##        CD8.CM        CD8.EM      CD8.MAIT CD8.NaiveLike     CD8.TEMRA 
##          2132          3181           147           238           276 
##       CD8.TEX      CD8.TPEX 
##          3340           337

DimPlot(gueguen.cd3, order = T,  label = T, repel = T)

DimPlot(gueguen.cd3, group.by = 'functional.cluster', order = T, cols = mycols, label = T, repel = T)

We can check specific TPEX genes to confirm that identities match.

DefaultAssay(gueguen.cd3) <- 'RNA'
FeaturePlot(gueguen.cd3, features = c('XCL1','XCL2'), ncol = 2, pt.size = 0.5, order = T, cols = pals::coolwarm()) & NoLegend()

# Radar plots
p <- plot.states.radar(ref.cd8, query = gueguen.cd3, min.cells = 10, genes4radar = c('LEF1', "TCF7", "CCR7", "GZMK", "FGFBP2",'FCGR3A','ZNF683','ITGAE', "CRTAM", "CD200",'GNG4', "HAVCR2", "TOX", "ENTPD1", 'TYROBP','KIR2DL1'), return = T) 
wrap_plots(p) + theme_bw()

We can see that the previously homogeneous cluster CD8-LAYN is actually composed of two subsets: CD8.TEX and CD8.TPEX.

How to assess quality of mapping?

The first thing we recommend to do is verifying the expression of expected marker genes. In addition, ProjecTILs provides multiple tools to help the researcher decide if the projection and classification are accurate (e.g. https://carmonalab.github.io/ProjecTILs_CaseStudies/novelstate.html).

Detecting TPEX in Yost et al. 2019

Setup data

# Load data
#download.file("https://figshare.com/ndownloader/files/39109277", destfile = "Yost.cd3.Rds")
Yost.cd3 <- readRDS("Yost.cd3.Rds")

# Normalize data
Yost.cd3 <- NormalizeData(Yost.cd3)
Yost.cd3 <- ScaleData(Yost.cd3)

## Centering and scaling data matrix

# DimPlots
DimPlot(Yost.cd3, reduction = 'umap', group.by = 'cluster', label = T)

DimPlot(Yost.cd3, reduction = 'umap', group.by = 'patient', label = T, repel = T)

Projection

As this dataset is a mix between CD4 and CD8 T cells, we will keep the parameter filter.cells as TRUE to keep only CD8+ T cells.

DefaultAssay(Yost.cd3) <- "RNA"
Yost.cd3 <- ProjecTILs.classifier(Yost.cd3, ref = ref.cd8, filter.cells = T, split.by = 'patient', ncores = 6)
table(Yost.cd3$functional.cluster)

## 
##        CD8.CM        CD8.EM      CD8.MAIT CD8.NaiveLike     CD8.TEMRA 
##          3806          5296           316           438           965 
##       CD8.TEX      CD8.TPEX 
##          2370           499

DimPlot(Yost.cd3, group.by = 'functional.cluster', order = T, cols = mycols, label = T, repel = T)

Here again we detect TPEX. Let’s check globally how the expression profiles of marker genes look.

# Radar plots
p <- plot.states.radar(ref.cd8, query = Yost.cd3, min.cells = 10, genes4radar = c('LEF1', "TCF7", "CCR7", "GZMK", "FGFBP2",'FCGR3A','ZNF683','ITGAE', "CRTAM", "CD200",'GNG4', "HAVCR2", "TOX", "ENTPD1", 'TYROBP','KIR2DL1'), return = T) 
wrap_plots(p) + theme_bw()

We can see that predicted CD8.TPEX displey consistent marker gene profiles. In the authors’ original UMAP space, however, TPEX are found scattered. This is because activation signals and other confounding factors were contributing to defining the UMAP space. Reference-based annotation uncovers cell type signals masked by the activation program. If you are interested in recovering cell type identities hidden by transient cell states, you can read more in the corresponding tutorial.

Session Info

sessionInfo()

## R version 4.2.1 (2022-06-23)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices datasets  utils     methods   base     
## 
## other attached packages:
##  [1] plotly_4.10.1          EnhancedVolcano_1.14.0 pheatmap_1.0.12       
##  [4] UCell_2.2.0            scales_1.2.1           ggrepel_0.9.2         
##  [7] SignatuR_0.1.0         gridExtra_2.3          ProjecTILs_3.0.2      
## [10] patchwork_1.1.2        GEOquery_2.66.0        Biobase_2.58.0        
## [13] BiocGenerics_0.44.0    data.table_1.14.6      STACAS_2.0.0          
## [16] scGate_1.4.1           forcats_0.5.2          stringr_1.5.0         
## [19] dplyr_1.0.10           purrr_1.0.1            readr_2.1.3           
## [22] tidyr_1.2.1            tibble_3.1.8           ggplot2_3.4.0         
## [25] tidyverse_1.3.2        SeuratObject_4.1.3     Seurat_4.3.0          
## 
## loaded via a namespace (and not attached):
##   [1] utf8_1.2.2                  spatstat.explore_3.0-5     
##   [3] reticulate_1.27             tidyselect_1.2.0           
##   [5] htmlwidgets_1.6.1           grid_4.2.1                 
##   [7] BiocParallel_1.32.5         Rtsne_0.16                 
##   [9] munsell_0.5.0               codetools_0.2-18           
##  [11] ica_1.0-3                   umap_0.2.9.0               
##  [13] future_1.30.0               miniUI_0.1.1.1             
##  [15] withr_2.5.0                 spatstat.random_3.1-2      
##  [17] colorspace_2.1-0            progressr_0.13.0           
##  [19] highr_0.10                  knitr_1.41                 
##  [21] rstudioapi_0.14             stats4_4.2.1               
##  [23] SingleCellExperiment_1.20.0 ROCR_1.0-11                
##  [25] tensor_1.5                  listenv_0.9.0              
##  [27] labeling_0.4.2              MatrixGenerics_1.10.0      
##  [29] GenomeInfoDbData_1.2.9      polyclip_1.10-4            
##  [31] farver_2.1.1                parallelly_1.34.0          
##  [33] vctrs_0.5.2                 generics_0.1.3             
##  [35] xfun_0.36                   timechange_0.2.0           
##  [37] R6_2.5.1                    GenomeInfoDb_1.34.7        
##  [39] rmdformats_1.0.4            pals_1.7                   
##  [41] bitops_1.0-7                spatstat.utils_3.0-1       
##  [43] cachem_1.0.6                DelayedArray_0.24.0        
##  [45] assertthat_0.2.1            promises_1.2.0.1           
##  [47] googlesheets4_1.0.1         gtable_0.3.1               
##  [49] globals_0.16.2              goftest_1.2-3              
##  [51] rlang_1.0.6                 splines_4.2.1              
##  [53] lazyeval_0.2.2              gargle_1.2.1               
##  [55] dichromat_2.0-0.1           spatstat.geom_3.0-5        
##  [57] broom_1.0.2                 BiocManager_1.30.19        
##  [59] yaml_2.3.7                  reshape2_1.4.4             
##  [61] abind_1.4-5                 modelr_0.1.10              
##  [63] backports_1.4.1             httpuv_1.6.8               
##  [65] tools_4.2.1                 bookdown_0.32              
##  [67] ellipsis_0.3.2              jquerylib_0.1.4            
##  [69] RColorBrewer_1.1-3          ggridges_0.5.4             
##  [71] Rcpp_1.0.10                 plyr_1.8.8                 
##  [73] zlibbioc_1.44.0             RCurl_1.98-1.9             
##  [75] openssl_2.0.5               deldir_1.0-6               
##  [77] pbapply_1.7-0               cowplot_1.1.1              
##  [79] S4Vectors_0.36.1            zoo_1.8-11                 
##  [81] SummarizedExperiment_1.28.0 haven_2.5.1                
##  [83] cluster_2.1.4               fs_1.6.0                   
##  [85] magrittr_2.0.3              RSpectra_0.16-1            
##  [87] scattermore_0.8             lmtest_0.9-40              
##  [89] reprex_2.0.2                RANN_2.6.1                 
##  [91] googledrive_2.0.0           fitdistrplus_1.1-8         
##  [93] matrixStats_0.63.0          hms_1.1.2                  
##  [95] mime_0.12                   evaluate_0.20              
##  [97] xtable_1.8-4                readxl_1.4.1               
##  [99] IRanges_2.32.0              compiler_4.2.1             
## [101] maps_3.4.1                  KernSmooth_2.23-20         
## [103] crayon_1.5.2                htmltools_0.5.4            
## [105] later_1.3.0                 tzdb_0.3.0                 
## [107] lubridate_1.9.0             DBI_1.1.3                  
## [109] dbplyr_2.3.0                MASS_7.3-58.2              
## [111] data.tree_1.0.0             Matrix_1.5-3               
## [113] cli_3.6.0                   parallel_4.2.1             
## [115] igraph_1.3.5                GenomicRanges_1.50.2       
## [117] pkgconfig_2.0.3             sp_1.6-0                   
## [119] spatstat.sparse_3.0-0       xml2_1.3.3                 
## [121] bslib_0.4.2                 XVector_0.38.0             
## [123] rvest_1.0.3                 digest_0.6.31              
## [125] pracma_2.4.2                sctransform_0.3.5          
## [127] RcppAnnoy_0.0.20            spatstat.data_3.0-0        
## [129] rmarkdown_2.20              cellranger_1.1.0           
## [131] leiden_0.4.3                uwot_0.1.14                
## [133] shiny_1.7.4                 lifecycle_1.0.3            
## [135] nlme_3.1-161                jsonlite_1.8.4             
## [137] BiocNeighbors_1.16.0        mapproj_1.2.9              
## [139] askpass_1.1                 viridisLite_0.4.1          
## [141] limma_3.54.0                fansi_1.0.4                
## [143] pillar_1.8.1                lattice_0.20-45            
## [145] fastmap_1.1.0               httr_1.4.4                 
## [147] survival_3.5-0              glue_1.6.2                 
## [149] png_0.1-8                   stringi_1.7.12             
## [151] sass_0.4.5                  renv_0.15.5                
## [153] irlba_2.3.5.1               future.apply_1.10.0