TCRanker demonstration

Ranking clonal exhaustion and proliferation levels of TILs from PMEL mice

Introduction

TCRanker is a tool to predict the tumor-reactivity of TCRs. It takes as input scRNA-seq data paired with scTCR-seq data, filters cell type wanted (CD8+ T cells by default), and ranks TCRs in a tumor sample according to their likelihood to recognize tumor antigens. To rank TCRs, TCRanker evaluates T cell clonal expansion and T cell transcriptional features associated with tumor reactivity, mainly exhaustion level, in each T cell clonotype.

In this demonstration, We used the paired scRNA-seq and scTCR-seq of CD8+ TILs from C57BL/6 (n=4) and PMEL (n=3) mice bearing B16-F10 melanoma tumor from the study by Carmona et al., OncoImmunology (2020)


Installation

Firstly, install TCRanker. To filter high-quality T cells, TCRanker uses scGate, which should also be installed.

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
remotes::install_github("carmonalab/scGate")
remotes::install_github("carmonalab/TCRanker")
## 
## * checking for file ‘/private/var/folders/03/yl12ggwx1q11_5sw1m62v7jc0000gn/T/Rtmp5fRFnN/remotesc758773fcf9d/carmonalab-TCRanker-b9d82e3/DESCRIPTION’ ... OK
## * preparing ‘TCRanker’:
## * checking DESCRIPTION meta-information ... OK
## * checking for LF line-endings in source and make files and shell scripts
## * checking for empty or unneeded directories
## * building ‘TCRanker_0.1.8.tar.gz’
library(scGate)
library(TCRanker)

Data Set

To know how to prepare the input data set for TCRanker, please refer TCRanker preparation.

library(Seurat)

outDir <- paste0(getwd(),'/output') # Director for output files
download.file(url = 'https://figshare.com/ndownloader/files/35983556?private_link=cc77615ad7bb69dd59ca',
              destfile = paste0(outDir,'/query.b16.RDS')) 

query.b16 <- readRDS(paste0(outDir,'/query.b16.RDS'))

TCRanker

Now we could apply TCRanker over our data set.

The output data frame would have the following columns:

columns description
clonotype Clonotypes, the form depends on the input
group Group info., optional, depends on whether in the input
size Clonal size, the count of cells, also group-related (if provided)
freq Clonal frequency, also group-related (if provided)
score 1 Clonal state score by the first gene signature
ranking 1 Clonal state ranking by the first gene signature
score 2 Clonal state score by the second gene signature
ranking 2 Clonal state ranking by the second gene signature
More scores and rankings… (if more custom gene signatures provided)
Table 1


Parameters: querytcr, filterCell, exhaustion and proliferation

Firstly, let’s see the default behaviours of TCRanker with only basic parameters provided.

query is the Seurat or SingleCellExperiment object with the expression matrix and, optimally, TCR as well.

Besides the query data set, the column names of TCR clonotype in meta.data also need to be specified to tcr. It’s also possible to present the clonotype as a separated vector, but such practice is not recommended.

ranking.B16 <- TCRanker(query = query.b16, tcr = "clonotype")
## Filtering CD8T Cells. Set filterCell=NULL to disable pre-filtering
## 
## ### Detected a total of 4 non-pure cells for selected signatures - 0.08% cells marked for removal (active.ident)
## 1930 (1926 with NA clonotype, 4 impure) out of 7174 ( 26.9% ) cells removed.
## Using assay RNA
## Using TCR meta.data clonotype
## Using default mouse gene signature.
## Calculating individual scores with UCell
## Done

As shown above, by default, TCRanker will filter the CD8+ T cells first. It could be disabled by setting filterCell='none'. If you are interested, you could also choose filterCell='CD4T'; filterCell='NK' or any other pre-defined models provided by scGate

We know that our data set contains CD8+ T cells only (and it’s proven by how little amount of cells filtered). So we’ll set filterCell='none' in later demonstration.

And this is the output data frame we’d have:

Table 2


As shown above, by default, TCRanker provides the scores and rankings for exhaustion and proliferation. You could exclude any of them simply by setting exhaustion=FALSE and proliferation=FALSE. Please be noted that it is mandatory to have at least one set of gene signature. So if you exclude both of them, make sure to provide a custom gene signature instead.

Parameters: group, minClonSize

The parameter group would lead TCRanker to add a column of group and, sequentially, sub-divide the entry by the group, (i.e. the cells with the same TCR clonotype but from different groups might be divided into two or more sub-group and listed separately in the output.)

It’s optional, yet recommended to be included in order to obtain more meaningful result if your data is from multiple individuals or sample groups.

ranking.b16 <- TCRanker(query = query.b16, tcr = "clonotype", 
                        group = "Sample", filterCell = NULL,
                        proliferation = F)
Table 3


This is the output with group. Among 80 entries, there are only 66 unique clonotypes. You should have noticed that the number of unique clonotypes decreased in table 3 comparing to table 2. This is due to the minClonSize=5 setting: when sub-divided into smaller clonal groups, some might fall below the threshold of 5 cells to be considered as meaningfully expanded. The number is changeable.

Parameters: signature

In order to extend the functionality, custom gene signatures are also allowed to analyse more wanted clonal states.

Here we use the signature of CD8+ naive-like and precursor-exhausted Andreatta et al., 2021 as examples:

naiveLike <- c('Ccr7', 'Il7r', 'Sell', 'Tcf7', 'Txk', 'S1pr1', 'Lef1', 'Satb1')

Tpex <- c('Lag3', 'Xcl1', 'Crtam', 'Ifng', 'Ccl4', 'Pdcd1', 'Dusp4', 'Cd8a', 
          'Zeb2', 'Nr4a2', 'Sla', 'Nkg7', 'Tigit', 'Ctsw', 'Tnfrsf9', 'Tox',
          'Lyst', 'Tnfsf4', 'Ccl3', 'Gzmb', 'Rab27a', 'Prf1', 'Cd70', 'Plscr1',
          'Cxcl13')

customSignatures <- list(naiveLike = naiveLike, Tpex = Tpex)

Wrap your custom signatures in a list. And remember to name the entries to avoid ambiguity in the output.

Having only one set of signature, you could also use a simple vector, and the state would be called “user” in the output.

ranking.b16 <- TCRanker(query = query.b16, tcr = "clonotype", 
                        group = "Sample", filterCell = NULL,
                        signature = customSignatures)
#Save the RDS for later usage
saveRDS(ranking.b16, file = paste0(outDir,'/ranking.b16.RDS'))
Table 4


Other parameters

  • keepObject: Logical, to return the SingleCellExperiment/Saurat object after cell filtering or not (returned together with the ranking data frame in a list). FALSE by default.

  • assay:
    Name of expression data assay, By default “counts” for SingleCellExperiment and “RNA” for Seurat.

  • species: Charactor, “mouse” or “human”, will be auto-detected if omitted. Currently only these two are supported.

  • FUN: Function used to aggregate scores of the same clonotype. It could be mean, median or customized functions that take a numeric vector or list as input and return a single numeric. By default, it uses “mean”.

  • strictFilter: Logical, to exclude impure cells in pure clonotype or not. Only valid when filterCell is on. TRUE by default.


Now we have the clonal ranking of a certain cell state. In order to bridge the current format of clonotypes with further lab validation or comparison, you might need to assemble the TCR to the full sequence instead of a simple combination of cdr3 chains.

There should be plenty of tools available. For the demonstration of how to generate the full sequence with stitchr by Heather et al., Nucleic Acids Research (2022), you can check TCRanker.stitchr.