Every T cell receptor (TCR) repertoire is shaped by a complex probabilistic tangle of genetically determined biases and immune exposures. T cells combine a random V(D)J recombination process with a ...selection process to generate highly diverse and functional TCRs. The extent to which an individual's genetic background is associated with their resulting TCR repertoire diversity has yet to be fully explored. Using a previously published repertoire sequencing dataset paired with high-resolution genome-wide genotyping from a large human cohort, we infer specific genetic loci associated with V(D)J recombination probabilities using genome-wide association inference. We show that V(D)J gene usage profiles are associated with variation in the TCRB locus and, specifically for the functional TCR repertoire, variation in the major histocompatibility complex locus. Further, we identify specific variations in the genes encoding the Artemis protein and the TdT protein to be associated with biasing junctional nucleotide deletion and N-insertion, respectively. These results refine our understanding of genetically determined TCR repertoire biases by confirming and extending previous studies on the genetic determinants of V(D)J gene usage and providing the first examples of genetic variants which are associated with modifying junctional diversity. Together, these insights lay the groundwork for further explorations into how immune responses vary between individuals.
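The idea of testing a single variant against a junctional phenotype can be illustrated with a minimal sketch. It is not the inference procedure used in the study: it assumes one hypothetical variant coded as an allele dosage (0, 1 or 2) per individual, a hypothetical per-individual phenotype such as the mean number of N-insertions, and a plain least-squares fit with no covariates or multiple-testing correction. The function name and toy values are illustrative only.

```python
def genotype_association(dosages, phenotypes):
    """Least-squares slope and intercept of phenotype on allele dosage.

    dosages    -- allele counts (0, 1 or 2), one per individual (hypothetical data)
    phenotypes -- repertoire summary per individual, e.g. mean N-insertions
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching lists with at least three individuals")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx                    # phenotype change per extra allele copy
    intercept = mean_y - slope * mean_x  # expected phenotype at dosage 0
    return slope, intercept


# Toy data: each additional allele copy adds one N-insertion on average.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_n_insertions = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope, intercept = genotype_association(toy_dosages, toy_n_insertions)
```

A genome-wide analysis would repeat a test of this kind for every variant and would need to account for population structure and the number of tests performed.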
Peptidomimetics are classes of molecules that mimic structural and functional attributes of polypeptides. Peptidomimetic oligomers can frequently be synthesized using efficient solid phase synthesis ...procedures similar to peptide synthesis. Conformationally ordered peptidomimetic oligomers are finding broad applications for molecular recognition and for inhibiting protein-protein interactions. One critical limitation is the limited set of design tools for identifying oligomer sequences that can adopt desired conformations. Here, we present expansions to the ROSETTA platform that enable structure prediction and design of five non-peptidic oligomer scaffolds (noncanonical backbones): oligooxopiperazines, oligo-peptoids, β-peptides, hydrogen bond surrogate helices and oligosaccharides. This work is complementary to prior additions to model noncanonical protein side chains in ROSETTA. The main purpose of our manuscript is to give a detailed description to current and future developers of how each of these noncanonical backbones was implemented. Furthermore, we provide a general outline for implementation of new backbone types not discussed here. To illustrate the utility of this approach, we describe the first tests of the ROSETTA molecular mechanics energy function in the context of oligooxopiperazines, using quantum mechanical calculations as comparison points, scanning through backbone and side chain torsion angles for a model peptidomimetic. Finally, as an example of a novel design application, we describe the automated design of an oligooxopiperazine that inhibits the p53-MDM2 protein-protein interaction. For the general biological and bioengineering community, several noncanonical backbones have been incorporated into web applications that allow users to freely and rapidly test the presented protocols (http://rosie.rosettacommons.org). This work helps address the peptidomimetic community's need for an automated and expandable modeling tool for noncanonical backbones.
TAL effectors are re-targetable transcription factors used for tailored gene regulation and, as TAL effector-nuclease fusions (TALENs), for genome engineering. Their hallmark feature is a ...customizable central string of polymorphic amino acid repeats that interact one-to-one with individual DNA bases to specify the target. Sequences targeted by TAL effector repeats in nature are nearly all directly preceded by a thymine (T) that is required for maximal activity, and target sites for custom TAL effector constructs have typically been selected with this constraint. Multiple crystal structures suggest that this requirement for T at base 0 is encoded by a tryptophan residue (W232) in a cryptic repeat N-terminal to the central repeats that exhibits energetically favorable van der Waals contacts with the T. We generated variants based on TAL effector PthXo1 with all single amino acid substitutions for W232. In a transcriptional activation assay, many substitutions altered or relaxed the specificity for T and a few were as active as wild type. Some showed higher activity. However, when replicated in a different TAL effector, the effects of the substitutions differed. Further, the effects differed when tested in the context of a TALEN in a DNA cleavage assay, and in a TAL effector-DNA binding assay. Substitution of the N-terminal region of the PthXo1 construct with that of one of the TAL effector-like proteins of Ralstonia solanacearum, which have arginine in place of the tryptophan, resulted in specificity for guanine as the 5' base but low activity, and several substitutions for the arginine, including tryptophan, destroyed activity altogether. Thus, the effects on specificity and activity generated by substitutions at the W232 (or equivalent) position are complex and context dependent. Generating TAL effector scaffolds with high activity that robustly accommodate sites without a T at position 0 may require larger scale re-engineering.
Adaptive immune recognition is mediated by antigen receptors on B and T cells generated by somatic recombination during lineage development. The high level of diversity resulting from this process ...posed technical challenges that previously limited the comprehensive analysis of adaptive immune recognition. Advances over the last ten years have produced data and approaches allowing insights into how T cells develop, evolutionary signatures of recombination and selection, and the features of T cell receptors that mediate epitope-specific binding and T cell activation. The size and complexity of these data have necessitated the generation of novel computational and analytical approaches, which are transforming how T cell immunology is conducted. Here we review the development and application of novel biological, theoretical, and computational methods for understanding T cell recognition and discuss the potential for improved models of receptor:antigen interactions.
Mutations affecting the spliceosomal protein U2AF1 are among the most common mutations observed in patients with MDS and related disorders. However, it is unclear how these mutations affect the ...normal RNA splicing process, and how the resulting changes in splicing contribute to myeloid dysplasia. Here, we combined the strengths of data from primary AML patient samples with the controlled context of isogenic cell lines. We generated K562 erythroleukemic cell lines stably expressing each of the four common U2AF1 mutations (S34F, S34Y, Q157P, and Q157R). We compared expression of each of these mutant alleles with knockdown of endogenous U2AF1 to assess the relative consequences of U2AF1 mutations versus loss of function.
We first sought to identify changes in splicing driven by U2AF1 mutations that contribute to myeloid dysplasia. We compared the splicing of ~125,000 annotated alternative splicing events and ~160,000 constitutively spliced junctions between AML samples with or without U2AF1 mutations (TCGA cohort), as well as between our isogenic K562 cell lines stably expressing either mutant (S34F, S34Y, Q157P, and Q157R) or wild-type (WT) alleles of U2AF1. Unsupervised cluster analysis revealed that S34F/Y samples clustered separately from Q157P/R samples in both the AML data and our cell lines, suggesting that U2AF1 mutations affecting different residues of the protein have different molecular consequences. Intersecting the AML and K562 data, we identified >300 splicing events that were consistently differentially spliced in association with S34 mutations, and a similar number for Q157 mutations. Many of these splicing events affected biological pathways that have been implicated in myeloid malignancies, including DNA methylation (DNMT3B), X chromosome inactivation (H2AFY), the DNA damage response (ATR, FANCA), and apoptosis (CASP8). For example, two exons of DNMT3B are differentially spliced in both AML samples and our K562 cells (Figure A), including an exon lying within the methyltransferase domain.
We next identified mechanistic changes in the splicing process caused by U2AF1 mutations. U2AF1 binds the intron-exon boundary by sequence-specifically recognizing the AG dinucleotide and flanking sequence positions that define the 3′ splice site. Comparing AML samples and K562 cells with and without U2AF1 mutations, we found that S34 and Q157 mutations give rise to specific and distinct alterations in 3′ splice site preference. S34 mutations alter the consensus nucleotide immediately before the AG dinucleotide, while Q157 mutations alter the consensus nucleotide immediately after the AG (Figure B). We observed highly similar allele-specific alterations in 3′ splice site preference in every AML patient with a U2AF1 mutation, as well as all K562 cell lines expressing a U2AF1 mutant allele. In contrast, knock down of endogenous U2AF1 caused no alterations in the consensus sequence at those positions, indicating that U2AF1 mutations do not cause loss of function at the level of RNA splicing.
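The comparison of consensus nucleotides flanking the AG can be illustrated with a minimal sketch. It assumes a hypothetical list of 3′ splice site sequences that have already been extracted and aligned so that the AG dinucleotide starts at a known offset; the function name, the offset convention and the toy sequences are illustrative and are not taken from the analysis described here.

```python
from collections import Counter


def flank_frequencies(sites, ag_index):
    """Nucleotide frequencies immediately before and after the AG.

    sites    -- aligned 3' splice site sequences (hypothetical input)
    ag_index -- 0-based offset of the 'A' of the AG in every sequence
    """
    before, after = Counter(), Counter()
    for seq in sites:
        seq = seq.upper()
        # Skip sequences that are too short or lack AG at the expected offset.
        if ag_index < 1 or len(seq) <= ag_index + 2:
            continue
        if seq[ag_index:ag_index + 2] != "AG":
            continue
        before[seq[ag_index - 1]] += 1  # position just upstream of the AG
        after[seq[ag_index + 2]] += 1   # position just downstream of the AG

    def normalize(counts):
        total = sum(counts.values())
        return {base: counts[base] / total for base in "ACGT"} if total else {}

    return normalize(before), normalize(after)


# Toy sequences with the AG at offset 4; the last one has no AG and is skipped.
toy_sites = ["TTTCAGGT", "TTTTAGAT", "CCTCAGGA", "TTTTTTGT"]
freq_before, freq_after = flank_frequencies(toy_sites, ag_index=4)
```

Running the same tabulation separately on splice sites used preferentially in mutant and in wild-type samples, then comparing the two frequency tables, is one simple way to look for a shift in preference at either position.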
To confirm that the nucleotides immediately before and after the AG determine whether a splice site responds to U2AF1 mutations, we created minigenes of cassette exons within the ATR and EPB49 genes. We found that response to U2AF1 S34 and Q157 mutations requires the endogenous nucleotides immediately before and after the AG, as predicted by our genomics analysis, and that mutating these positions abolishes response to U2AF1 mutations. Finally, we recapitulated the RNA splicing process in vitro using nuclear extract from blood cells expressing either wild-type or mutant U2AF1 to show that identical changes in splice site preference occur in a controlled in vitro context (Figure C).
Together, our data show that U2AF1 mutations cause allele-specific alterations in normal 3′ splice site recognition in patients, in isogenic cell lines, and in vitro. These alterations in splice site preference give rise to mis-splicing that affects many genes previously implicated in myeloid malignancies.
No relevant conflicts of interest to declare.
In the original version of this Article the colour key for the amino acid enrichment score was inadvertently omitted from the lower panel of Figure 5b during the production process. This has now been ...corrected in the PDF and HTML versions of the Article.
TAL (transcriptional activator‐like) effectors are DNA‐binding repeat proteins recently shown to recognize their target sites by an unprecedented, 1:1 mapping between repeat residues and DNA bases. ...The structural basis for this recognition is not known; in particular, it is not clear whether such 1:1 recognition can be accommodated by standard major‐groove readout of B‐form DNA. Here we describe a structure prediction protocol tailored to the TAL–DNA system, and report simulation results that shed light on observed repeat‐base associations and overall TAL structure. Our models demonstrate that TAL–DNA interactions can be explained by a model in which the TAL repeat domain forms a superhelical repeat structure that wraps around undistorted B‐form DNA, paralleling the geometry of the major groove, with contacts between position 13 of each repeat and its associated base pair on the sense strand determining the specificity of DNA recognition.
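The 1:1 mapping between repeats and bases can be made concrete with a minimal sketch. The table below is an assumption drawn from the broader TAL effector literature rather than from this abstract: it uses the commonly reported associations for the two-residue repeat-variable diresidue (positions 12 and 13 of each repeat), and it simplifies NN to G although NN is also reported to recognize A. The function name is hypothetical.

```python
# Commonly reported repeat-variable diresidue (RVD) to base associations.
# Assumption: simplified table; NN is reported to bind A as well as G.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def predicted_target(rvds):
    """Predict a target sequence from an ordered list of RVDs.

    Each repeat contributes exactly one base; RVDs missing from the
    table are reported as 'N' (unknown).
    """
    return "".join(RVD_TO_BASE.get(rvd.upper(), "N") for rvd in rvds)


# Four repeats map to a four-base target, one base per repeat.
example = predicted_target(["NI", "HD", "NG", "NN"])
```

The lookup treats each repeat independently, which matches the one-repeat-per-base picture but ignores any context effects between neighboring repeats.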
Circular tandem repeat proteins (‘cTRPs’) are de novo designed protein scaffolds (in this and prior studies, based on antiparallel two-helix bundles) that contain repeated protein sequences ...and structural motifs and form closed circular structures. They can display significant stability and solubility, a wide range of sizes, and are useful as protein display particles for biotechnology applications. However, cTRPs also demonstrate inefficient self-assembly from smaller subunits. In this study, we describe a new generation of cTRPs, with longer repeats and increased interaction surfaces, which enhanced the self-assembly of two significantly different sizes of homotrimeric constructs. Finally, we demonstrated functionalization of these constructs with (1) a hexameric array of peptide-binding SH2 domains, and (2) a trimeric array of anti-SARS CoV-2 VHH domains. The latter proved capable of sub-nanomolar binding affinities towards the viral receptor binding domain and potent viral neutralization function.
Computational protein–protein docking methods currently can create models with atomic accuracy for protein complexes provided that the conformational changes upon association are restricted to the ...side chains. However, it remains very challenging to account for backbone conformational changes during docking, and most current methods inherently keep monomer backbones rigid for algorithmic simplicity and computational efficiency. Here we present a reformulation of the Rosetta docking method that incorporates explicit backbone flexibility in protein–protein docking. The new method is based on a “fold-tree” representation of the molecular system, which seamlessly integrates internal torsional degrees of freedom and rigid-body degrees of freedom. Problems with internal flexible regions ranging from one or more loops or hinge regions to all of one or both partners can be readily treated using appropriately constructed fold trees. The explicit treatment of backbone flexibility improves both sampling in the vicinity of the native docked conformation and the energetic discrimination between near-native and incorrect models.
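How a single tree can hold both torsional and rigid-body degrees of freedom can be sketched in a few lines. This is an illustrative data structure, not the Rosetta implementation: it assumes residues are numbered consecutively across the two partners, that chain edges carry a label of -1 and rigid-body "jump" edges carry a positive jump number, and that each partner has one anchor residue from which its chain edges radiate. The class and function names are hypothetical.

```python
from dataclasses import dataclass

PEPTIDE = -1  # label for a chain edge (internal torsions); jumps use labels >= 1


@dataclass(frozen=True)
class Edge:
    start: int
    stop: int
    label: int


def docking_fold_tree(len_a, len_b, anchor_a, anchor_b):
    """Edges for two partners connected by one rigid-body jump.

    Partner A is residues 1..len_a and partner B is len_a+1..len_a+len_b.
    anchor_a and anchor_b (global numbering) are the jump end points.
    """
    first_b, last_b = len_a + 1, len_a + len_b
    if not (1 <= anchor_a <= len_a and first_b <= anchor_b <= last_b):
        raise ValueError("each anchor must lie inside its own partner")
    edges = []
    # Chain edges radiate from the anchor toward both ends of partner A.
    if anchor_a > 1:
        edges.append(Edge(anchor_a, 1, PEPTIDE))
    if anchor_a < len_a:
        edges.append(Edge(anchor_a, len_a, PEPTIDE))
    # The jump carries the rigid-body degrees of freedom between partners.
    edges.append(Edge(anchor_a, anchor_b, 1))
    # Chain edges radiate from the anchor toward both ends of partner B.
    if anchor_b > first_b:
        edges.append(Edge(anchor_b, first_b, PEPTIDE))
    if anchor_b < last_b:
        edges.append(Edge(anchor_b, last_b, PEPTIDE))
    return edges


# A 5-residue partner and a 4-residue partner joined between residues 3 and 7.
tree = docking_fold_tree(len_a=5, len_b=4, anchor_a=3, anchor_b=7)
```

In a representation like this, moving a torsion on a chain edge propagates only to residues further from the root along that edge, while changing the jump moves one whole partner relative to the other. Adding further jumps and chain breaks would isolate a flexible loop or hinge in the same scheme.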