By EVOBYTE, your partner in bioinformatics
Introduction
If you’ve ever stared at a massive TCR repertoire file and wondered which clones actually “see” your target antigen, you’re not alone. Predicting antigen‑specific T cells sits at the intersection of immunology and machine learning, and it’s quickly becoming essential in TCR‑T therapy, immuno‑oncology biomarker work, and vaccine programs that rely on T cell responses. Yet, despite eye‑catching headlines, this is still a hard problem with stubborn data bottlenecks and biology that refuses to be oversimplified.
In this overview, we’ll ground the discussion in T cell receptor biology, explain why MHC class I and class II matter, and then walk through the current methods landscape. We’ll look at sequence‑based clustering and classification, structure‑based modeling, and where tools like GLIPH2, TCRNET, NetTCR, and TCRmodel2 fit in. Along the way, we’ll talk about why training data are limited, how that shapes model performance, and what a practical workflow looks like when you need to assess on‑ and off‑targets for a therapy or evaluate whether a vaccine or tumor actually elicited functional T cells.
TCRs, pMHC, and why MHC class I vs class II changes the question
T cell receptors (TCRs) are heterodimers, usually αβ, that engage short peptides bound in the groove of major histocompatibility complex (MHC) molecules—often called peptide–MHC, or pMHC. In humans, MHC molecules are encoded by HLA genes. At the interface, the most variable portions of the TCR, especially the CDR3 loops, drive specificity by contacting the peptide residues that protrude from the MHC groove, while germline‑encoded regions help orient the receptor on the MHC scaffold. Because the peptide sits in the MHC groove, the same epitope can look different depending on the presenting HLA allele, and a single TCR may tolerate “mimic” peptides that preserve key contact features. That biological flexibility is both a blessing and a curse for prediction.
The class of MHC presenting the peptide further shapes the problem. MHC class I typically presents 8–10mer peptides on nearly all nucleated cells and pairs with CD8 T cells. MHC class II, expressed mainly by professional antigen‑presenting cells, binds longer and more variable peptides—often 13–25mers—and pairs with CD4 T cells. The class I groove is structurally closed at the ends and favors tight length constraints; class II is open at the ends, which means the same core can appear with different flanking residues. If you’re predicting specificity, those structural realities affect the feature space your model must learn and the way you curate training data. Cross‑presentation, allele polymorphism, and antigen processing add even more context that purely sequence‑based models don’t always capture.
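To make the curation point concrete, here is a minimal sketch in Python. The record layout and the exact length cutoffs are illustrative assumptions, not a standard schema; adjust both to whatever your data source actually provides.

```python
from dataclasses import dataclass

# Hypothetical record layout for illustration; real databases (VDJdb, IEDB)
# use their own column names and conventions.
@dataclass
class SpecificityRecord:
    cdr3b: str       # CDR3 beta amino acid sequence
    peptide: str     # presented epitope
    mhc_class: str   # "MHCI" or "MHCII"
    hla_allele: str  # e.g. "HLA-A*02:01"

def plausible_length(rec: SpecificityRecord) -> bool:
    """Keep records whose peptide length is consistent with the presenting class."""
    n = len(rec.peptide)
    if rec.mhc_class == "MHCI":
        return 8 <= n <= 11   # closed class I groove: mostly 8-10mers, some 11mers
    if rec.mhc_class == "MHCII":
        return 12 <= n <= 25  # open class II groove: longer, variable-length peptides
    return False

def curate(records):
    """Drop entries whose peptide length does not fit the annotated MHC class."""
    return [r for r in records if plausible_length(r)]
```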
Why predicting TCR–pMHC specificity matters in the clinic
The most immediate reason is safety. TCR‑T products target intracellular antigens presented on HLA, enabling access to tumor proteins that CAR‑T cannot reach. But with that power comes the risk of cross‑reactivity. High‑affinity engineered TCRs against tumor antigens like MAGE‑A3 have caused fatal cardiac toxicity by recognizing a titin‑derived peptide presented in the heart—a sobering reminder that “similar enough” can be clinically catastrophic. Robust in silico and wet‑lab off‑target screens are therefore non‑negotiable.
Prediction also matters for go/no‑go decisions. In immuno‑oncology, you want evidence that cancer‑reactive T cells are actually raised—before, during, and after treatment. In vaccine studies, you need to know if your design elicits the intended CD8 or CD4 responses across diverse HLA backgrounds. If you can prioritize likely pMHC targets and TCRs that recognize them, you accelerate assay design, triage epitopes for validation, and save precious patient samples for the experiments that truly de‑risk your program.
Why this is hard: data scarcity, bias, and biology
At first glance, the field looks data‑rich. We have public repositories of TCRs with known epitopes and HLA context, as well as vast bulk and single‑cell repertoires. But the data suited for supervised learning—paired αβ chains, known peptide, known HLA, and a clean binding or activation readout—are rare compared with the universe of possible interactions. Public specificity datasets also exhibit strong biases: they over‑represent common viral epitopes, class I more than class II, and HLA‑A*02:01 far more than many other alleles. Pairing information is often missing, and many TCRs were mapped with tetramers to only a handful of peptides, which constrains negative examples and inflates apparent performance if you don’t build the right splits. In practice, most models interpolate within familiar peptide families and high‑resource HLAs; true generalization to new epitopes, alleles, or organisms remains difficult.
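A quick way to see these biases in your own training table is to profile it before modeling. The sketch below assumes a tab-separated file with hypothetical column names (cdr3b, peptide, hla, mhc_class); rename them to match your export.

```python
import pandas as pd

# Assumed file name and column names, for illustration only.
df = pd.read_csv("tcr_specificity.tsv", sep="\t")  # columns: cdr3b, peptide, hla, mhc_class

# How concentrated is the data in a few epitopes and alleles?
epitope_share = df["peptide"].value_counts(normalize=True)
allele_share = df["hla"].value_counts(normalize=True)

print("Top 5 epitopes hold", round(epitope_share.head(5).sum() * 100, 1), "% of examples")
print("HLA-A*02:01 share:", round(allele_share.get("HLA-A*02:01", 0.0) * 100, 1), "%")
print("Class balance:\n", df["mhc_class"].value_counts(normalize=True))
```

If a handful of viral epitopes and one allele dominate these counts, treat any headline accuracy number with suspicion until you have checked how the benchmark was split.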
Databases like VDJdb, McPAS‑TCR, and IEDB are invaluable, yet they highlight these constraints. They aggregate curated TCR–epitope–HLA associations across diseases, but the absolute number of validated TCR–peptide pairs is tiny compared with the theoretical search space. This is why we see models excel on certain benchmarks yet stumble in prospective tests for novel peptide–HLA contexts or for class II.
The methods landscape: from clustering to deep learning to structure
When people say “predict antigen‑specific T cells,” they often mean one of three things. First, clustering: grouping TCRs that likely share specificity based on CDR3 motifs and local sequence features. Second, binary classification: given a TCR and a peptide (and sometimes HLA), predict whether they bind. Third, structural modeling: generate a 3D model of the TCR–pMHC complex to rationalize or prioritize interactions. Each category answers a slightly different question, and mature pipelines frequently combine them.
A widely used clustering approach is GLIPH2, which scans CDR3β sequences for enriched local motifs and groups TCRs that likely recognize the same epitope. It scales to millions of sequences and has been applied to infectious disease, autoimmunity, and tumor studies to infer specificity groups and HLA restriction patterns. The output is hypothesis‑generating rather than a definitive binder list, but it’s an efficient way to triage candidates for experimental follow‑up. Recent benchmarking suggests that neighborhood‑enrichment methods such as ALICE and TCRNET can outperform GLIPH2 on several datasets, reminding us that repertoire context and local density signals hold real value when labels are sparse.
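To give a feel for the idea (this is not GLIPH2’s actual algorithm), here is a toy motif-enrichment sketch: count local k-mers in the CDR3β sequences of interest, compare against a reference repertoire, and keep motifs that are strongly over-represented.

```python
from collections import Counter
from math import log2

def kmers(cdr3: str, k: int = 3):
    """Yield local k-mers from the central, hypervariable part of a CDR3."""
    core = cdr3[3:-3] if len(cdr3) > 6 + k else cdr3
    for i in range(len(core) - k + 1):
        yield core[i:i + k]

def enriched_motifs(sample_cdr3s, reference_cdr3s, k=3, min_count=3, min_log2_fc=2.0):
    """Toy enrichment: k-mers over-represented in the sample vs. a reference repertoire."""
    sample = Counter(m for s in sample_cdr3s for m in kmers(s, k))
    ref = Counter(m for s in reference_cdr3s for m in kmers(s, k))
    n_s, n_r = sum(sample.values()) or 1, sum(ref.values()) or 1
    hits = {}
    for motif, count in sample.items():
        if count < min_count:
            continue
        fold = log2((count / n_s) / ((ref.get(motif, 0) + 1) / n_r))  # +1 pseudocount
        if fold >= min_log2_fc:
            hits[motif] = fold
    return sorted(hits.items(), key=lambda item: -item[1])
```

Motifs that survive this kind of filter are candidates for grouping TCRs, not evidence of binding; the follow-up is always experimental.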
For direct TCR–peptide binding prediction, sequence‑based deep learning models have multiplied. NetTCR 2.2 focuses on class I peptides and accepts peptide plus CDR sequences (often both chains’ CDR1/2/3), producing a binding probability. Other public tools, like TCRex, provide epitope‑specific classifiers for defined panels of viral and cancer peptides and can be retrained on custom data. These models are practical when your peptides of interest overlap with their training space and your HLA context is supported. Their limitations surface when you switch alleles, move to class II, or ask for true de novo generalization to unseen peptide families.
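As a rough illustration of what these classifiers consume and produce, the sketch below one-hot encodes a peptide plus a CDR3β and fits a plain logistic-regression baseline. It is not NetTCR; real models use richer architectures and paired-chain CDR1/2/3 inputs, but the input/output shape of the problem is the same.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

AA = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {a: i for i, a in enumerate(AA)}

def one_hot(seq: str, max_len: int) -> np.ndarray:
    """Fixed-length one-hot encoding, zero-padded; 20 channels per position."""
    x = np.zeros((max_len, len(AA)), dtype=np.float32)
    for i, aa in enumerate(seq[:max_len]):
        if aa in AA_INDEX:
            x[i, AA_INDEX[aa]] = 1.0
    return x.ravel()

def featurize(peptide: str, cdr3b: str, pep_len=12, cdr_len=25) -> np.ndarray:
    """Concatenate peptide and CDR3 beta encodings into one feature vector."""
    return np.concatenate([one_hot(peptide, pep_len), one_hot(cdr3b, cdr_len)])

def train_baseline(pairs, labels):
    """pairs: list of (peptide, cdr3b); labels: 1 = binder, 0 = non-binder."""
    X = np.stack([featurize(p, c) for p, c in pairs])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```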
Structure‑aware approaches are gaining traction for exactly those hard cases. TCRmodel2 is a major update that leverages AlphaFold to fold and dock TCR–pMHC complexes more accurately and more quickly than a default AlphaFold pipeline. It offers a community server with built‑in visualization and confidence scores, making it feasible to compare poses across candidate peptides or TCRs before committing to costly experiments. Structural models don’t magically solve generalization, but they let you reason mechanistically about anchor positions, clashes, or permissive substitutions, and they complement sequence‑only scores with interpretable hypotheses.
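One practical pattern is to rank candidate complexes by per-residue confidence before detailed inspection. The sketch below assumes AlphaFold-style models that store pLDDT in the PDB B-factor column (verify this for your own outputs and file names) and uses Biopython to average it over Cα atoms.

```python
from pathlib import Path
from Bio.PDB import PDBParser

def mean_plddt(pdb_path: str) -> float:
    """Average per-residue confidence, assuming pLDDT is written into the
    B-factor column as AlphaFold-style pipelines commonly do."""
    structure = PDBParser(QUIET=True).get_structure("model", pdb_path)
    scores = [atom.get_bfactor() for atom in structure.get_atoms()
              if atom.get_name() == "CA"]
    return sum(scores) / len(scores) if scores else 0.0

# Rank candidate TCR-pMHC models before committing to experiments.
# "models/" is an assumed directory of predicted complexes.
ranked = sorted(Path("models").glob("*.pdb"), key=lambda p: -mean_plddt(str(p)))
for model_file in ranked[:5]:
    print(model_file.name, round(mean_plddt(str(model_file)), 1))
```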
What the clustering tools actually tell you
Clustering methods like GLIPH2 summarize repertoire structure and point to specificity groups, but they don’t assert that any particular TCR binds a particular peptide in your HLA context. Treat them as traffic maps—where do interesting flows converge?—and then validate. In contrast, sequence‑based classifiers output a binding score for a specific TCR–peptide pair; they’re strongest when your query sits near their training distribution. Be alert to “peptide memorization” and optimistic cross‑validation splits; when possible, use peptide‑ or epitope‑held‑out benchmarks before trusting a model on novel targets.
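If you roll your own benchmark, the key safeguard is simple: split by peptide, not by row. A minimal sketch, assuming each record is a dict with a "peptide" key:

```python
import random

def peptide_holdout_split(records, test_fraction=0.2, seed=0):
    """Group examples by peptide so no epitope appears in both train and test.
    This guards against peptide memorization inflating benchmark scores."""
    peptides = sorted({r["peptide"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(peptides)
    n_test = max(1, int(len(peptides) * test_fraction))
    test_peptides = set(peptides[:n_test])
    train = [r for r in records if r["peptide"] not in test_peptides]
    test = [r for r in records if r["peptide"] in test_peptides]
    return train, test
```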
Structural modeling with TCRmodel2 gives you a pose and confidence, which you can mine for side‑chain contacts or clashes. But even a plausible pose can fail in cells if antigen processing, peptide abundance, or HLA stability don’t cooperate. Conversely, a mediocre score can still activate if your antigen is presented at high density or if the TCR has favorable kinetics. No single layer is definitive; consistency across layers is the signal you should look for.
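For the pose-mining step, a small contact analysis goes a long way. The sketch below uses Biopython to list TCR–peptide residue pairs within a heavy-atom distance cutoff; the chain IDs are assumptions and will depend on how your modeling tool names chains.

```python
from Bio.PDB import PDBParser, NeighborSearch

def interface_contacts(pdb_path, tcr_chains=("D", "E"), peptide_chain="C", cutoff=4.5):
    """Residue pairs within a heavy-atom distance cutoff between the TCR chains
    and the peptide. Chain IDs are assumptions; check your model's naming."""
    model = PDBParser(QUIET=True).get_structure("complex", pdb_path)[0]
    peptide_atoms = [a for a in model[peptide_chain].get_atoms() if a.element != "H"]
    search = NeighborSearch(peptide_atoms)
    contacts = set()
    for chain_id in tcr_chains:
        for atom in model[chain_id].get_atoms():
            for hit in search.search(atom.coord, cutoff):
                contacts.add((chain_id, atom.get_parent().get_id()[1],
                              peptide_chain, hit.get_parent().get_id()[1]))
    return sorted(contacts)
```

A contact map like this is only a hypothesis about recognition; use it to design mutagenesis or alanine scans, not to replace them.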
The current state of prediction—and where it’s going next
The field has moved from single‑method enthusiasm to pragmatic, multi‑modal pipelines. On the sequence side, newer architectures and pretraining on massive TCR corpora are improving generalization by capturing grammar in CDR3s and V/J usage patterns. On the structure side, specialized fold‑and‑dock methods are closing the gap between sequence‑only scores and mechanistic hypotheses. Meanwhile, repertoire‑context methods like TCRNET remind us that enrichment in the local neighborhood often carries stronger evidence than any single motif alone, especially when labels are scarce. Independent benchmarks have started to compare these streams head‑to‑head, and results vary by dataset, which is a healthy sign: there’s no free lunch here, just better matched tools for specific questions.
What’s most exciting is the integration with upstream and downstream biology. Immunopeptidomics defines what’s actually presented on a given HLA; MHC‑binding and processing predictors select realistic peptides; TCR‑aware models then rank likely binders; and high‑throughput multimers and single‑cell assays close the loop. As standardized datasets grow—particularly for class II, diverse HLAs, and paired αβ TCRs—we should expect steadier gains and fewer surprises. For therapy, better off‑target prediction remains the north star. The MAGE‑A3/titin lesson has already shaped preclinical diligence, and models that scan proteomes for mimic peptides in relevant HLA contexts are becoming standard alongside in vitro testing.
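As a very rough illustration of the mimic-scanning idea, the sketch below slides a window of the epitope’s length across a proteome and flags near-identical self-peptides by Hamming distance. A production off-target screen would layer on MHC-binding prediction for the relevant allele and weight known TCR contact positions more heavily, as dedicated tools do.

```python
def mimic_scan(target_epitope: str, proteome: dict, max_mismatches: int = 3):
    """Flag self-peptides close to the target epitope.

    proteome: dict mapping protein ID -> amino acid sequence (an assumed input
    format). Returns (protein_id, 1-based start, peptide, mismatch count).
    """
    k = len(target_epitope)
    hits = []
    for protein_id, seq in proteome.items():
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            mismatches = sum(a != b for a, b in zip(window, target_epitope))
            if mismatches <= max_mismatches:
                hits.append((protein_id, i + 1, window, mismatches))
    return sorted(hits, key=lambda hit: hit[3])
```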
Summary / Takeaways
Predicting antigen‑specific T cells is not about finding a single perfect model. It’s about layering evidence across biology‑aware tools. Start with clear immunological context: class I versus class II, HLA restriction, and the real peptide supply. Use repertoire‑level clustering to find specificity groups and sequence‑based classifiers to score concrete TCR–peptide pairs. Bring in structure when you need mechanistic plausibility, and never skip wet‑lab validation, especially if you are anywhere near the clinic.
If you’re building or buying a pipeline this quarter, consider this question: will your process still work when the peptide is new, the HLA is rare, and only a handful of paired αβ TCRs are available? If the answer is “yes,” you’re designing for the real world rather than the benchmark.
Further Reading
- GLIPH2: clustering TCRs by specificity at scale (Nature Biotechnology, 2020)
- NetTCR 2.2: sequence‑based prediction of TCR–peptide binding (DTU server)
- TCRmodel2: AlphaFold‑based modeling of TCR–pMHC complexes (Nucleic Acids Research)
- VDJdb and McPAS‑TCR: curated TCR–epitope and pathology‑associated repertoires
- Affinity‑enhanced TCR off‑target toxicity (Blood, 2013)
