TCRanker to stitchr
Introduction
In this demonstration, we use the TCRanker results from paired scRNA-seq and scTCR-seq of CD8+ TILs from C57BL/6 (n=4) and PMEL (n=3) mice bearing B16-F10 melanoma tumors, from the study by Carmona et al., OncoImmunology (2020).
We’ll use stitchr by Heather et al., Nucleic Acids Research (2022) to assemble the full nucleotide sequences of the TCRs.
From the instructions on the stitchr GitHub page,
Data Set
To learn how to use TCRanker, please refer to the TCRanker demonstration.
inDir <- paste0(getwd(),'/input') # Directory for input files
outDir <- paste0(getwd(),'/output') # Directory for output files
query.b16 <- readRDS(file = paste0(outDir,'/query.b16.RDS'))
ranking.b16 <- readRDS(file = paste0(outDir,'/ranking.b16.RDS'))
Format Conversion
The following illustration is specific to our data set. You may need to adjust the workflow to match the exact format of your own data.
In our sample data set, the column CTstrict holds the full string of gene names and CDR3 sequences for each cell.
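To make the layout concrete, here is a minimal sketch that splits a single made-up CTstrict value. The gene names and sequences are invented for illustration, and the field order is an assumption inferred from the indices used later in this demonstration (the seventh field is not used there), so check it against your own data.

```r
# Made-up CTstrict value; layout assumed from the indices used in this demonstration:
# <alpha genes, dot-separated>_<alpha CDR3>_<beta genes, dot-separated>_<beta CDR3>
ctstrict <- "TRAV1.TRAJ2.TRAC_TGTGCTGTG_TRBV3.TRBJ4.None.TRBC1_TGTGCCAGC"

# Split the four underscore-separated blocks
parts <- strsplit(ctstrict, split = "_", fixed = TRUE)[[1]]
stopifnot(length(parts) == 4)

# Split the gene blocks on "." (fixed = TRUE, so "." is not treated as a regex)
alpha.genes <- strsplit(parts[1], split = ".", fixed = TRUE)[[1]]
beta.genes <- strsplit(parts[3], split = ".", fixed = TRUE)[[1]]

# Flatten into the nine fields expected downstream
fields <- c(alpha.genes, parts[2], beta.genes, parts[4])
names(fields) <- c("TRAV", "TRAJ", "TRAC", "TRA_CDR3",
                   "TRBV", "TRBJ", "unused_7", "TRBC", "TRB_CDR3")
print(fields)
```

If a clonotype lacks one of the chains, the split will not yield nine fields, so such clonotypes would need to be filtered out or handled separately.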
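# Extract the gene names and CDR3 sequences of one clonotype from CTstrict
get.genes <- function(clonotype, query){
# CTstrict strings of all cells assigned to this clonotype
CTstrict <- query$CTstrict[query$clonotype==clonotype & !is.na(query$clonotype)]
genes <- unique(CTstrict)
# Blocks are separated by "_": alpha genes, alpha CDR3, beta genes, beta CDR3
genes <- strsplit(genes, split = "_")
# Only the first unique CTstrict string is used; gene blocks are split on "."
gene.seq <- c(strsplit(genes[[1]][1],split = ".", fixed = TRUE)[[1]],
genes[[1]][2],
strsplit(genes[[1]][3],split = ".", fixed = TRUE)[[1]],
genes[[1]][4])
return(gene.seq)
}
# One column per clonotype, nine fields each
gene.list <- vapply(ranking.b16$clonotype, FUN = get.genes, FUN.VALUE = character(9), query=query.b16)
## Loading required package: SeuratObject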
## Attaching sp
Stitchr supports high-throughput processing of multiple, paired TCRs. The required input format is as follows:
stitchr.input <- data.frame(TCR_name = colnames(gene.list),
TRAV = gene.list[1,],
TRAJ = gene.list[2,],
TRA_CDR3 = gene.list[4,],
TRBV = gene.list[5,],
TRBJ = gene.list[6,],
TRB_CDR3 = gene.list[9,],
TRAC = gene.list[3,],
TRBC = gene.list[8,],
TRA_leader = rep("", ncol(gene.list)),
TRB_leader = rep("", ncol(gene.list)),
Linker = rep("", ncol(gene.list)),
Link_order = rep("", ncol(gene.list)),
TRA_5_prime_seq = rep("", ncol(gene.list)),
TRA_3_prime_seq = rep("", ncol(gene.list)),
TRB_5_prime_seq = rep("", ncol(gene.list)),
TRB_3_prime_seq = rep("", ncol(gene.list)))
Make sure you have all the columns shown above; stitchr is very strict about the input format.
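Because the format is strict, it can help to check the header before writing the file. The following is a minimal sketch: check.columns is a hypothetical helper, not part of stitchr or TCRanker, and the column names are simply those listed above. Whether stitchr also enforces column order is not stated here, so the sketch only reports it.

```r
# Column names as listed in this demonstration
required.cols <- c("TCR_name", "TRAV", "TRAJ", "TRA_CDR3",
                   "TRBV", "TRBJ", "TRB_CDR3", "TRAC", "TRBC",
                   "TRA_leader", "TRB_leader", "Linker", "Link_order",
                   "TRA_5_prime_seq", "TRA_3_prime_seq",
                   "TRB_5_prime_seq", "TRB_3_prime_seq")

# Hypothetical helper: report missing or unexpected columns and whether the order matches
check.columns <- function(df, required = required.cols) {
  list(missing = setdiff(required, colnames(df)),
       extra = setdiff(colnames(df), required),
       in.order = identical(colnames(df), required))
}

# Toy one-row table with every required column left empty
toy <- as.data.frame(matrix("", nrow = 1, ncol = length(required.cols),
                            dimnames = list(NULL, required.cols)),
                     stringsAsFactors = FALSE)

res.ok <- check.columns(toy)
# Same table with the Linker column dropped
res.bad <- check.columns(toy[, setdiff(required.cols, "Linker")])
print(res.bad$missing)
```

In this workflow the same check could be applied to stitchr.input before calling write.table.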
write.table(stitchr.input, file = paste0(inDir, "/stitchr.input.tsv"),
row.names = FALSE, sep = "\t", quote = FALSE)
Now, with the stitchr.input.tsv file, you can run thimble.py (the stitchr script for processing multiple TCRs).
Make sure you have changed your working directory to stitchr's /Scripts directory, as stitchr requires.
An example:
python3 thimble.py -in ~/TCRanker.demo/input/stitchr.input.tsv -o ~/TCRanker.demo/output/stitchr.output.tsv
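The same call could also be launched from the R session. This is a minimal sketch, assuming python3 is on the PATH and that stitchr was cloned to ~/stitchr, which is a hypothetical location. It uses only the -in and -o options shown above.

```r
# Hypothetical location of the stitchr scripts; adjust to your installation
scripts.dir <- path.expand("~/stitchr/Scripts")

# Same input and output files as in the command above
in.file <- path.expand("~/TCRanker.demo/input/stitchr.input.tsv")
out.file <- path.expand("~/TCRanker.demo/output/stitchr.output.tsv")

thimble.args <- c("thimble.py", "-in", shQuote(in.file), "-o", shQuote(out.file))

# Run only if the script is actually there; otherwise just show the command
if (file.exists(file.path(scripts.dir, "thimble.py"))) {
  old.wd <- setwd(scripts.dir)  # run thimble.py from its own directory
  status <- system2("python3", args = thimble.args)
  setwd(old.wd)                 # restore the previous working directory
} else {
  message("thimble.py not found; would run: python3 ",
          paste(thimble.args, collapse = " "))
}
```

system2 returns the exit status of the command, so a non-zero value in status would indicate that the run failed.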