Adaptive immunity is driven by the expansion, somatic hypermutation, and selection of B cell clones. Each clone is the progeny of a single B cell responding to Ag, with diversified Ig receptors. These receptors can now be profiled on a large scale by next-generation sequencing. Such data provide a window into the microevolutionary dynamics that drive successful immune responses and the dysregulation that occurs with aging or disease. Clonal relationships are not directly measured and must instead be computationally inferred from these sequencing data. Although several hierarchical clustering-based methods have been proposed, they vary in distance and linkage methods and have not yet been rigorously compared. In this study, we use a combination of human experimental and simulated data to characterize the performance of hierarchical clustering-based methods for partitioning sequences into clones. We find that single linkage clustering has high performance, with specificity, sensitivity, and positive predictive value all >99%, whereas other linkages result in a significant loss of sensitivity. Surprisingly, distance metrics that incorporate the biases of somatic hypermutation do not outperform simple Hamming distance. Although errors were more likely in sequences with short junctions, using the entire dataset to choose a single distance threshold for clustering is near optimal. Our results suggest that hierarchical clustering using single linkage with Hamming distance identifies clones with high confidence and provides a fully automated method for clonal grouping. The performance estimates we develop provide important context to interpret clonal analysis of repertoire sequencing data and allow for rigorous testing of other clonal grouping algorithms.
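The grouping idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the length-normalized Hamming distance, the restriction to equal-length junctions, and the threshold value are all assumptions made for illustration. Cutting a single-linkage dendrogram at a fixed distance is equivalent to taking connected components of the graph that joins every pair of sequences within that distance, which is what the union-find below computes.

```python
from collections import defaultdict
from typing import List


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_clones(junctions: List[str], threshold: float) -> List[List[str]]:
    """Partition junction sequences into clones by single linkage.

    Two sequences end up in the same clone if a chain of pairwise links,
    each with distance <= threshold, connects them.
    """
    parent = list(range(len(junctions)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Assumption: only junctions of identical length are compared.
    by_length = defaultdict(list)
    for idx, seq in enumerate(junctions):
        by_length[len(seq)].append(idx)

    for indices in by_length.values():
        for pos, i in enumerate(indices):
            for j in indices[pos + 1:]:
                if hamming_fraction(junctions[i], junctions[j]) <= threshold:
                    parent[find(i)] = find(j)

    clones = defaultdict(list)
    for idx, seq in enumerate(junctions):
        clones[find(idx)].append(seq)
    return sorted((sorted(c) for c in clones.values()), key=lambda c: c[0])


# Toy data; the 0.15 threshold is arbitrary, not a value from the study.
toy = ["ACGTACGTAC", "ACGTACGTAA", "ACGTACGAAA", "TTTTTTTTTT", "ACGT"]
clones = single_linkage_clones(toy, threshold=0.15)
```

In the toy data the first and third sequences differ at 2 of 10 positions, above the threshold, yet they fall into the same clone because each is within the threshold of the second sequence. This chaining behavior is what distinguishes single linkage from the other linkage methods.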
Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our knowledge of variations in the genomic loci encoding immunoglobulin genes is incomplete, resulting in conflicting VDJ gene assignments and biased genotype and haplotype inference. Haplotypes can be inferred using IGHJ6 heterozygosity, observed in one third of people. Here, we propose a robust novel method for determining VDJ haplotypes by adapting a Bayesian framework. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. The inferred haplotypes may have clinical implications for genetic disease predispositions. Our findings expand the knowledge that can be extracted from antibody repertoire sequencing data.
Driven by dramatic technological improvements, large-scale characterization of lymphocyte receptor repertoires via high-throughput sequencing is now feasible. Although promising, the high germline and somatic diversity, especially of B-cell immunoglobulin repertoires, presents challenges for analysis requiring the development of specialized computational pipelines. We developed the REpertoire Sequencing TOolkit (pRESTO) for processing reads from high-throughput lymphocyte receptor studies. pRESTO processes raw sequences to produce error-corrected, sorted and annotated sequence sets, along with a wealth of metrics at each step. The toolkit supports multiplexed primer pools, single- or paired-end reads and emerging technologies that use single-molecule identifiers. pRESTO has been tested on data generated from Roche and Illumina platforms. It has a built-in capacity to parallelize the work between available processors and is able to efficiently process millions of sequences generated by typical high-throughput projects.
pRESTO is freely available for academic use. The software package and detailed tutorials may be downloaded from http://clip.med.yale.edu/presto.
Myasthenia gravis (MG) is a prototypical B cell-mediated autoimmune disease affecting 20-50 people per 100,000. The majority of patients fall into two clinically distinguishable types based on whether they produce autoantibodies targeting the acetylcholine receptor (AChR-MG) or muscle specific kinase (MuSK-MG). The autoantibodies are pathogenic, but whether their generation is associated with broader defects in the B cell repertoire is unknown. To address this question, we performed deep sequencing of the BCR repertoire of AChR-MG, MuSK-MG, and healthy subjects to generate ∼518,000 unique VH and VL sequences from sorted naive and memory B cell populations. AChR-MG and MuSK-MG subjects displayed distinct gene segment usage biases in both VH and VL sequences within the naive and memory compartments. The memory compartment of AChR-MG was further characterized by reduced positive selection of somatic mutations in the VH CDR and altered VH CDR3 physicochemical properties. The VL repertoire of MuSK-MG was specifically characterized by reduced V-J segment distance in recombined sequences, suggesting diminished VL receptor editing during B cell development. Our results identify large-scale abnormalities in both the naive and memory B cell repertoires. Particular abnormalities were unique to either AChR-MG or MuSK-MG, indicating that the repertoires reflect the distinct properties of the subtypes. These repertoire abnormalities are consistent with previously observed defects in B cell tolerance checkpoints in MG, thereby offering additional insight regarding the impact of tolerance defects on peripheral autoimmune repertoires. These collective findings point toward a deformed B cell repertoire as a fundamental component of MG.
Adenosine-to-inosine (A-to-I) editing modifies RNA transcripts from their genomic blueprint. A prerequisite for this process is a double-stranded RNA (dsRNA) structure. Such dsRNAs are formed as part of the microRNA (miRNA) maturation process, and it is therefore expected that miRNAs are affected by A-to-I editing. Editing of miRNAs has the potential to add another layer of complexity to gene regulation pathways, especially if editing occurs within the miRNA-mRNA recognition site. Thus, it is of interest to study the extent of this phenomenon. Current reports in the literature disagree on its extent; while some reports claim that it may be widespread, others deem the reported events as rare. Utilizing a next-generation sequencing (NGS) approach supplemented by an extensive bioinformatic analysis, we were able to systematically identify A-to-I editing events in mature miRNAs derived from human brain tissues. Our algorithm successfully identified many of the known editing sites in mature miRNAs and revealed 17 novel human sites, 12 of which are in the recognition sites of the miRNAs. We confirmed most of the editing events using in vitro ADAR overexpression assays. The editing efficiency of most sites identified is very low. Similar results are obtained for publicly available data sets of mouse brain-region tissues. Thus, we find that A-to-I editing does alter several miRNAs, but it is not widespread.
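Because inosine is read as guanosine during sequencing, A-to-I editing shows up as A-to-G mismatches between reads and the reference. The following is a minimal sketch of how a per-site editing level could be estimated, not the algorithm used in the study: the function name, the assumption of gap-free reads already aligned to the full mature miRNA, and the toy sequences are all hypothetical, and no filtering for sequencing errors or SNPs is attempted.

```python
from typing import Dict, List


def a_to_g_levels(reference: str, reads: List[str], min_depth: int = 1) -> Dict[int, float]:
    """Return {0-based position: G / (A + G)} for reference adenosines.

    Only positions with at least one G read and an A+G depth of at least
    `min_depth` are reported. Reads must have the same length as the
    reference (an assumption of this sketch).
    """
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be aligned to the full reference")

    levels = {}
    for pos, ref_base in enumerate(reference):
        if ref_base != "A":
            continue
        a_count = sum(1 for r in reads if r[pos] == "A")
        g_count = sum(1 for r in reads if r[pos] == "G")
        depth = a_count + g_count
        if g_count > 0 and depth >= min_depth:
            levels[pos] = g_count / depth
    return levels


# Hypothetical toy reference (DNA alphabet) and four reads.
reference = "TGAGGTAG"
reads = ["TGAGGTAG", "TGAGGTAG", "TGGGGTAG", "TGAGGTGG"]
levels = a_to_g_levels(reference, reads)
```

A real analysis would additionally need to separate true editing from sequencing errors, which matters here because the abstract reports that the editing efficiency of most sites is very low.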
Multiple sclerosis (MS) is an inflammatory disease of the central nervous system (CNS) characterized by autoimmune-mediated demyelination and neurodegeneration. The CNS of patients with MS harbors expanded clones of antigen-experienced B cells that reside in distinct compartments including the meninges, cerebrospinal fluid (CSF), and parenchyma. It is not understood whether this immune infiltrate initiates its development in the CNS or in peripheral tissues. B cells in the CSF can exchange with those in peripheral blood, implying that CNS B cells may have access to lymphoid tissue that may be the specific compartment(s) in which CNS-resident B cells encounter antigen and experience affinity maturation. Paired tissues were used to determine whether the B cells that populate the CNS mature in the draining cervical lymph nodes (CLNs). High-throughput sequencing of the antibody repertoire demonstrated that clonally expanded B cells were present in both compartments. Founding members of clones were more often found in the draining CLNs. More mature clonal members derived from these founders were observed in the draining CLNs and also in the CNS, including lesions. These data provide new evidence that B cells traffic freely across the tissue barrier, with the majority of B cell maturation occurring outside of the CNS in the secondary lymphoid tissue. Our study may aid in further defining the mechanisms of immunomodulatory therapies that either deplete circulating B cells or affect the intrathecal B cell compartment by inhibiting lymphocyte transmigration into the CNS.
Although structural studies of individual T cell receptors (TCRs) have revealed important roles for both the α and β chain in directing MHC and antigen recognition, repertoire-level immunogenomic analyses have historically examined the β chain alone. To determine the amount of useful information about TCR repertoire function encoded within αβ pairings, we analyzed paired TCR sequences from nearly 100,000 unique CD4+ and CD8+ T cells captured using two different high-throughput, single-cell sequencing approaches. Our results demonstrate little overlap in the healthy CD4+ and CD8+ repertoires, with shared TCR sequences possessing significantly shorter CDR3 sequences corresponding to higher generation probabilities. We further utilized tools from information theory and machine learning to show that while α and β chains are only weakly associated with lineage, αβ pairings appear to synergistically drive TCR-MHC interactions. Vαβ gene pairings were found to be the TCR feature most informative of T cell lineage, supporting the existence of germline-encoded paired αβ TCR-MHC interaction motifs. Finally, annotating our TCR pairs using a database of sequences with known antigen specificities, we demonstrate that approximately a third of the T cells possess α and β chains that each recognize different known antigens, suggesting that αβ pairing is critical for the accurate inference of repertoire functionality. Together, these findings provide biological insight into the functional implications of αβ pairing and highlight the utility of single-cell sequencing in immunogenomics.
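One way to make "informative of T cell lineage" concrete is mutual information between a TCR feature and the CD4/CD8 label. The sketch below is a plain plug-in estimator and is offered only as an illustration; the abstract does not specify which estimators the authors used, and the gene labels in the toy data are hypothetical placeholders. The toy data are constructed so that neither chain alone carries any information about lineage while the pairing determines it fully, which is an extreme case of the synergy described above.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(features: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Empirical mutual information I(feature; label) in bits."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    n = len(features)
    joint = Counter(zip(features, labels))
    p_feature = Counter(features)
    p_label = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (p_feature[x] * p_label[y]))
    return mi


# Toy cells: (V alpha gene, V beta gene, lineage). Labels are placeholders.
cells = [
    ("TRAV1", "TRBV1", "CD4"),
    ("TRAV2", "TRBV2", "CD4"),
    ("TRAV1", "TRBV2", "CD8"),
    ("TRAV2", "TRBV1", "CD8"),
]
lineage = [c[2] for c in cells]
mi_alpha = mutual_information([c[0] for c in cells], lineage)
mi_beta = mutual_information([c[1] for c in cells], lineage)
mi_pair = mutual_information([(c[0], c[1]) for c in cells], lineage)
```

Plug-in estimates of this kind are biased upward when many feature values are observed only a few times, so a real repertoire analysis would need some form of correction or a permutation baseline.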
The adaptive immune system confers protection by generating a diverse repertoire of antibody receptors that are rapidly expanded and contracted in response to specific targets. Next-generation DNA sequencing now provides the opportunity to survey this complex and vast repertoire. In the present work, we describe a set of tools for the analysis of antibody repertoires and their application to elucidating the dynamics of the response to viral vaccination in human volunteers. By analyzing data from 38 separate blood samples across 2 y, we found that the use of the germ-line library of V and J segments is conserved between individuals over time. Surprisingly, there appeared to be no correlation between the use level of a particular VJ combination and degree of expansion. We found the antibody RNA repertoire in each volunteer to be highly dynamic, with each individual displaying qualitatively different response dynamics. By using combinatorial phage display, we screened selected VH genes paired with their corresponding VL library for affinity against the vaccine antigens. Altogether, this work presents an additional set of tools for profiling the human antibody repertoire and demonstrates characterization of the fast repertoire dynamics through time in multiple individuals responding to an immune challenge.
Analyses of somatic hypermutation (SHM) patterns in B cell immunoglobulin (Ig) sequences contribute to our basic understanding of adaptive immunity, and have broad applications not only for understanding the immune response to pathogens, but also to determining the role of SHM in autoimmunity and B cell cancers. Although stochastic, SHM displays intrinsic biases that can confound statistical analysis, especially when combined with the particular codon usage and base composition in Ig sequences. Analysis of B cell clonal expansion, diversification, and selection processes thus critically depends on an accurate background model for SHM micro-sequence targeting (i.e., hot/cold-spots) and nucleotide substitution. Existing models are based on small numbers of sequences/mutations, in part because they depend on data from non-coding regions or non-functional sequences to remove the confounding influences of selection. Here, we combine high-throughput Ig sequencing with new computational analysis methods to produce improved models of SHM targeting and substitution that are based only on synonymous mutations, and are thus independent of selection. The resulting "S5F" models are based on 806,860 Synonymous mutations in 5-mer motifs from 1,145,182 Functional sequences and account for dependencies on the adjacent four nucleotides (two bases upstream and downstream of the mutation). The estimated profiles can explain almost half of the variance in observed mutation patterns, and clearly show that both mutation targeting and substitution are significantly influenced by neighboring bases. While mutability and substitution profiles were highly conserved across individuals, the variability across motifs was found to be much larger than previously estimated. The model and method source code are made available at http://clip.med.yale.edu/SHM.
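The 5-mer context described above (the mutated base plus two bases on each side) can be illustrated with a simple counting sketch. This is deliberately much cruder than the S5F models: it counts every mismatch rather than only synonymous mutations, applies no normalization beyond dividing by how often each motif was observed, and does not model substitution. The function name and the toy germline/observed pairs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def fivemer_mutation_rates(pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Observed mutation frequency of the central base of each germline 5-mer.

    Each pair is (germline, observed), aligned without gaps. Positions closer
    than two bases to either end are skipped because they lack a full 5-mer.
    """
    seen = Counter()
    mutated = Counter()
    for germline, observed in pairs:
        if len(germline) != len(observed):
            raise ValueError("germline and observed sequences must be aligned")
        for i in range(2, len(germline) - 2):
            motif = germline[i - 2:i + 3]  # two bases upstream and downstream
            seen[motif] += 1
            if observed[i] != germline[i]:
                mutated[motif] += 1
    return {motif: mutated[motif] / seen[motif] for motif in seen}


# Toy data: one mutated and one unmutated copy of the same germline segment.
pairs = [("AAGCTAA", "AAGTTAA"), ("AAGCTAA", "AAGCTAA")]
rates = fivemer_mutation_rates(pairs)
```

Restricting the count to synonymous mutations, as the abstract describes, would require tracking the reading frame and a codon table, which is omitted here to keep the sketch short.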
The partial success of tumor immunotherapy induced by checkpoint blockade, which is not antigen-specific, suggests that the immune system of some patients contains antigen receptors able to specifically identify tumor cells. Here we focused on T-cell receptor (TCR) repertoires associated with spontaneous breast cancer. We studied the alpha and beta chain CDR3 domains of TCR repertoires of CD4 T cells using deep sequencing of cell populations in mice and applied the results to published TCR sequence data obtained from human patients. We screened peripheral blood T cells obtained monthly from individual mice spontaneously developing breast tumors by 5 months. We then looked for identical TCR sequences in published human studies; we used TCGA data from tumors and healthy tissues of 1,256 breast cancer resections, data from 4 focused studies including sequences from tumors, lymph nodes, blood and healthy tissues, and a single-cell dataset of 3 breast cancer subjects. We now report that mice spontaneously developing breast cancer manifest shared, public CDR3 regions in both their alpha and beta chains and that a significant number of women with early breast cancer manifest identical CDR3 sequences. These findings suggest that the development of breast cancer is associated, across species, with biomarker, exclusive TCR repertoires.