Journal Article

Exhaustive T-cell repertoire sequencing of human peripheral blood samples reveals signatures of antigen selection and a directly measured repertoire size of at least 1 million clonotypes

01 May 2011-Genome Research (Cold Spring Harbor Lab)-Vol. 21, Iss: 5, pp 790-797
TL;DR: It is shown that the sensitivity of sequence-based repertoire profiling is limited by both sequencing depth and sequencing accuracy, and a new, directly measured, lower limit on individual T-cell repertoire size is established.
Abstract: Massively parallel sequencing is a useful approach for characterizing T-cell receptor diversity. However, immune receptors are extraordinarily difficult sequencing targets because any given receptor variant may be present in very low abundance and may differ legitimately by only a single nucleotide. We show that the sensitivity of sequence-based repertoire profiling is limited by both sequencing depth and sequencing accuracy. At two timepoints, 1 wk apart, we isolated bulk PBMC plus naive (CD45RA+/CD45RO-) and memory (CD45RA-/CD45RO+) T-cell subsets from a healthy donor. From T-cell receptor beta chain (TCRB) mRNA we constructed and sequenced multiple libraries to obtain a total of 1.7 billion paired sequence reads. The sequencing error rate was determined empirically and used to inform a high stringency data filtering procedure. The error filtered data yielded 1,061,522 distinct TCRB nucleotide sequences from this subject which establishes a new, directly measured, lower limit on individual T-cell repertoire size and provides a useful reference set of sequences for repertoire analysis. TCRB nucleotide sequences obtained from two additional donors were compared to those from the first donor and revealed limited sharing (up to 1.1%) of nucleotide sequences among donors, but substantially higher sharing (up to 14.2%) of inferred amino acid sequences. For each donor, shared amino acid sequences were encoded by a much larger diversity of nucleotide sequences than were unshared amino acid sequences. We also observed a highly statistically significant association between numbers of shared sequences and shared HLA class I alleles.
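
The gap between nucleotide-level and amino-acid-level sharing reported in the abstract is consistent with codon degeneracy: distinct nucleotide sequences can encode the same amino acid sequence. The following is a minimal Python sketch of that comparison, not the authors' pipeline. The sequences are made-up toy data, the helper names `translate` and `shared_sequences` are hypothetical, and the sketch assumes in-frame sequences and the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq):
    """Translate an in-frame nucleotide sequence (hypothetical helper).

    Trailing bases that do not form a full codon are ignored.
    """
    return "".join(
        CODON_TABLE[nt_seq[i:i + 3]] for i in range(0, len(nt_seq) - 2, 3)
    )


def shared_sequences(donor_a, donor_b):
    """Return (shared nucleotide sequences, shared amino acid sequences)."""
    shared_nt = set(donor_a) & set(donor_b)
    aa_a = {translate(seq) for seq in donor_a}
    aa_b = {translate(seq) for seq in donor_b}
    return shared_nt, aa_a & aa_b


# Toy data (assumption, not from the study): two donors with no nucleotide
# sequence in common, but one amino acid sequence encoded by different codons.
donor_a = {"TGTGCCAGC", "TGTGCCTGG"}
donor_b = {"TGCGCTAGT", "TGTGCCAAA"}

shared_nt, shared_aa = shared_sequences(donor_a, donor_b)
print("shared nucleotide sequences:", shared_nt)
print("shared amino acid sequences:", shared_aa)
```

In this toy case the two donors share no nucleotide sequence but do share one amino acid sequence, which is the qualitative pattern the abstract describes at much larger scale.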


Citations
Journal Article
21 Jun 2017-Nature
TL;DR: The GLIPH algorithm can analyse large numbers of TCR sequences and define TCR specificity groups shared by TCRs and individuals, which should greatly accelerate the analysis of T cell responses and expedite the identification of specific ligands.
Abstract: T cell receptor (TCR) sequences are very diverse, with many more possible sequence combinations than T cells in any one individual. Here we define the minimal requirements for TCR antigen specificity, through an analysis of TCR sequences using a panel of peptide and major histocompatibility complex (pMHC)-tetramer-sorted cells and structural data. From this analysis we developed an algorithm that we term GLIPH (grouping of lymphocyte interactions by paratope hotspots) to cluster TCRs with a high probability of sharing specificity owing to both conserved motifs and global similarity of complementarity-determining region 3 (CDR3) sequences. We show that GLIPH can reliably group TCRs of common specificity from different donors, and that conserved CDR3 motifs help to define the TCR clusters that are often contact points with the antigenic peptides. As an independent validation, we analysed 5,711 TCRβ chain sequences from reactive CD4 T cells from 22 individuals with latent Mycobacterium tuberculosis infection. We found 141 TCR specificity groups, including 16 distinct groups containing TCRs from multiple individuals. These TCR groups typically shared HLA alleles, allowing prediction of the likely HLA restriction, and a large number of M. tuberculosis T cell epitopes enabled us to identify pMHC ligands for all five of the groups tested. Mutagenesis and de novo TCR design confirmed that the GLIPH-identified motifs were critical and sufficient for shared-antigen recognition. Thus the GLIPH algorithm can analyse large numbers of TCR sequences and define TCR specificity groups shared by TCRs and individuals, which should greatly accelerate the analysis of T cell responses and expedite the identification of specific ligands.

697 citations

Journal Article
TL;DR: The results suggest that a highly diverse repertoire is maintained despite thymic involution; however, peripheral fitness selection of T cells leads to repertoire perturbations that can influence the immune response in the elderly.
Abstract: T-cell receptor (TCR) diversity, a prerequisite for immune system recognition of the universe of foreign antigens, is generated in the first two decades of life in the thymus and then persists to an unknown extent through life via homeostatic proliferation of naive T cells. We have used next-generation sequencing and nonparametric statistical analysis to estimate a lower bound for the total number of different TCR beta (TCRB) sequences in human repertoires. We arrived at surprisingly high minimal estimates of 100 million unique TCRB sequences in naive CD4 and CD8 T-cell repertoires of young adults. Naive repertoire richness modestly declined two- to fivefold in healthy elderly. Repertoire richness contraction with age was even less pronounced for memory CD4 and CD8 T cells. In contrast, age had a major impact on the inequality of clonal sizes, as estimated by a modified Gini–Simpson index clonality score. In particular, large naive T-cell clones that were distinct from memory clones were found in the repertoires of elderly individuals, indicating uneven homeostatic proliferation without development of a memory cell phenotype. Our results suggest that a highly diverse repertoire is maintained despite thymic involution; however, peripheral fitness selection of T cells leads to repertoire perturbations that can influence the immune response in the elderly.
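
The abstract above scores inequality of clonal sizes with a modified Gini-Simpson index, but does not say how the index was modified. As a hedged illustration only, the sketch below computes the standard, unmodified Gini-Simpson index, 1 minus the sum of squared clone frequencies. The function name `gini_simpson` and the clone counts are assumptions for illustration.

```python
def gini_simpson(clone_counts):
    """Standard Gini-Simpson index: 1 - sum of squared clone frequencies.

    This is the unmodified index, not the modified score used in the paper.
    Values near 0 indicate a repertoire dominated by a few clones.
    """
    total = sum(clone_counts)
    if total == 0:
        raise ValueError("clone_counts must contain at least one cell")
    return 1.0 - sum((count / total) ** 2 for count in clone_counts)


# Toy repertoires (assumption): same number of clones, different clone sizes.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [97, 1, 1, 1]

print("even:", gini_simpson(even_repertoire))
print("skewed:", gini_simpson(skewed_repertoire))
```

Both toy repertoires have the same richness (four clones), yet the skewed one scores much lower, which is why an inequality measure can change with age even when richness declines only modestly.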

567 citations


Cites background from "Exhaustive T-cell repertoire sequen..."

  • ...(12) measured 1 million different TCRB sequences in a peripheral blood sample....


Journal Article
TL;DR: MIGEC (molecular identifier groups–based error correction), a strategy for high-throughput sequencing data analysis, allows for nearly absolute error correction while fully preserving the natural diversity of complex immune repertoires.
Abstract: A two-step error correction process for high throughput–sequenced T- and B-cell receptors allows the elimination of most errors while not diminishing the natural complexity of the repertoires.

390 citations

Journal Article
21 Feb 2020-Science
TL;DR: The authors' single-cell transcriptome profile of the thymus across the human lifetime and across species provides a high-resolution census of T cell development within the native tissue microenvironment, identifies novel subpopulations of human thymic fibroblasts and epithelial cells, and locates them in situ.
Abstract: The thymus provides a nurturing environment for the differentiation and selection of T cells, a process orchestrated by their interaction with multiple thymic cell types. We used single-cell RNA sequencing to create a cell census of the human thymus across the life span and to reconstruct T cell differentiation trajectories and T cell receptor (TCR) recombination kinetics. Using this approach, we identified and located in situ CD8αα+ T cell populations, thymic fibroblast subtypes, and activated dendritic cell states. In addition, we reveal a bias in TCR recombination and selection, which is attributed to genomic position and the kinetics of lineage commitment. Taken together, our data provide a comprehensive atlas of the human thymus across the life span with new insights into human T cell development.

323 citations


Cites background from "Exhaustive T-cell repertoire sequen..."

  • ...To date, most of our knowledge of VDJ recombination and repertoire biases has come from animal models and human peripheral blood analysis, with little comprehensive data on the human thymic TCR repertoire (22, 24, 25)....


Journal Article
TL;DR: Studies on the role of autoreactive T-cells generated secondary to molecular mimicry, the diversity of the T-cell receptor repertoires of autoreactive T-cells, the role of exposure to cryptic antigens, the generation of autoimmune B-cell responses, and the interaction of microbiota and chemical adjuvants with the host immune system all provide clues that advance the understanding of the molecular mechanisms involved in the evolving concept of molecular mimicry.

296 citations

References
Journal Article
TL;DR: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++, which will facilitate further development of the alignment algorithms and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems.
Abstract: Summary: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of Linux, Macintosh and Windows operating systems. Availability: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/ Contact: clustalw@ucd.ie

25,325 citations


"Exhaustive T-cell repertoire sequen..." refers methods in this paper

  • ...The resulting sequence data were aligned against all available exon 2 and 3 nucleotide sequences from the 3.1.0 release of the IMGT/HLA database (Robinson et al. 2003) using ClustalW (Larkin et al. 2007)....


Journal Article
TL;DR: This article describes phred, a base-calling program for automated sequencer traces with improved accuracy. phred appears to be the first base-calling program to achieve a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined, independent of position in read, machine running conditions, or sequencing chemistry.
Abstract: The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both improved accuracy of the data processing software and reliable accuracy measures to reduce the need for human involvement in error correction and make human review more efficient. Here, we describe one step toward that goal: a base-calling program for automated sequencer traces, phred, with improved accuracy. phred appears to be the first base-calling program to achieve a lower error rate than the ABI software, averaging 40%-50% fewer errors in the data sets examined independent of position in read, machine running conditions, or sequencing chemistry.

7,627 citations


"Exhaustive T-cell repertoire sequen..." refers background in this paper

  • ...…raw single pass reads, we observed 9.4 errors per kilobase, but when we added the requirements of (1) double-strand coverage, (2) minimum quality score (Ewing and Green 1998) of Q30, and (3) no high-quality discrepancy between strands at any position, the error rate fell to 2.2 errors per kilobase....


Journal Article
TL;DR: The ability to estimate a probability of error for each base-call, as a function of certain parameters computed from the trace data, is developed and implemented in the base-calling program.
Abstract: Elimination of the data processing bottleneck in high-throughput sequencing will require both improved accuracy of data processing software and reliable measures of that accuracy. We have developed and implemented in our base-calling program phred the ability to estimate a probability of error for each base-call, as a function of certain parameters computed from the trace data. These error probabilities are shown here to be valid (correspond to actual error rates) and to have high power to discriminate correct base-calls from incorrect ones, for read data collected under several different chemistries and electrophoretic conditions. They play a critical role in our assembly program phrap and our finishing program consed.
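
The per-base error probabilities described above are conventionally reported as phred quality scores, Q = -10 log10(p), so the Q30 threshold quoted in the citing excerpt corresponds to an estimated error probability of 1 in 1,000 per base call. A minimal Python sketch of the conversion follows; the function names are hypothetical and are not part of phred itself.

```python
import math


def phred_to_error_prob(q):
    """Convert a phred quality score Q to an estimated per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a per-base error probability p (0 < p <= 1) to a phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


# Q30 is the minimum quality score named in the citing excerpt.
for q in (10, 20, 30, 40):
    print(f"Q{q}: estimated error probability {phred_to_error_prob(q):.4g}")
```

A quality score is an estimate for a single base call, so a Q30 cutoff by itself does not determine the observed error rate of a filtered data set.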

5,334 citations


"Exhaustive T-cell repertoire sequen..." refers background in this paper

  • ...…raw single pass reads, we observed 9.4 errors per kilobase, but when we added the requirements of (1) double-strand coverage, (2) minimum quality score (Ewing and Green 1998) of Q30, and (3) no high-quality discrepancy between strands at any position, the error rate fell to 2.2 errors per kilobase....


Journal Article
04 Aug 1988-Nature
TL;DR: This view of T-cell recognition has implications for how the receptors might be selected in the thymus and how they (and immunoglobulins) may have arisen during evolution.
Abstract: The four distinct T-cell antigen receptor polypeptides (alpha, beta, gamma, delta) form two different heterodimers (alpha:beta and gamma:delta) that are very similar to immunoglobulins in primary sequence, gene organization and modes of rearrangement. Whereas antibodies have both soluble and membrane forms that can bind to antigens alone, T-cell receptors exist only on cell surfaces and recognize antigen fragments only when they are embedded in major histocompatibility complex (MHC) molecules. Patterns of diversity in T-cell receptor genes together with structural features of immunoglobulin and MHC molecules suggest a model for how this recognition might occur. This view of T-cell recognition has implications for how the receptors might be selected in the thymus and how they (and immunoglobulins) may have arisen during evolution.

2,858 citations

Journal Article
TL;DR: This report documents the additions and revisions to the nomenclature of HLA specificities following the principles established in previous reports.
Abstract: The WHO Nomenclature Committee for Factors of the HLA System met following the 14th International HLA and Immunogenetics Workshop in Melbourne, Australia in December 2005 and Buzios, Brazil during the 15th International HLA and Immunogenetics Workshop in September 2008. This report documents the additions and revisions to the nomenclature of HLA specificities following the principles established in previous reports (1–18).

2,390 citations


Additional excerpts

  • ...Allele assignments (four-digit codes) (Marsh et al. 2010) were based on high-quality exact or synonymous matches at informative nucleotide positions....
