The c-Myc HLH-bZIP protein has been implicated in physiological or pathological growth, proliferation, apoptosis, metabolism, and differentiation at the cellular, tissue, or organismal levels via regulation of numerous target genes. No principle yet unifies Myc action, due partly to an incomplete inventory and functional accounting of Myc’s targets. To observe Myc target expression and function in a system where Myc is temporally and physiologically regulated, the transcriptomes and the genome-wide distributions of Myc, RNA polymerase II, and chromatin modifications were compared during lymphocyte activation and in ES cells as well. A remarkably simple rule emerged from this quantitative analysis: Myc is not an on-off specifier of gene activity, but is a nonlinear amplifier of expression, acting universally at active genes, except for immediate early genes that are strongly induced before Myc. This rule of Myc action explains the vast majority of Myc biology observed in the literature.
The critical initial step in V(D)J recombination, binding of RAG1 and RAG2 to recombination signal sequences flanking antigen receptor V, D, and J gene segments, has not previously been characterized in vivo. Here, we demonstrate that RAG protein binding occurs in a highly focal manner to a small region of active chromatin encompassing Igκ and Tcrα J gene segments and Igh and Tcrβ J and J-proximal D gene segments. Formation of these small RAG-bound regions, which we refer to as recombination centers, occurs in a developmental stage- and lineage-specific manner. Each RAG protein is independently capable of specific binding within recombination centers. While RAG1 binding was detected only at regions containing recombination signal sequences, RAG2 binds at thousands of sites in the genome containing histone 3 trimethylated at lysine 4. We propose that recombination centers coordinate V(D)J recombination by providing discrete sites within which gene segments are captured for recombination.
► RAG1 and RAG2 bind focally to J gene segments in the Igh, Igκ, Tcrα, and Tcrβ loci ► Each RAG protein is independently capable of specific binding to chromatin ► RAG1 binding depends on recognition of recombination signal sequences ► RAG2 binds throughout the genome at sites of histone 3 trimethylated at lysine 4
DNA double-strand breaks (DSBs) represent a threat to the genome because they can lead to the loss of genetic information and chromosome rearrangements. The DNA repair protein p53 binding protein 1 (53BP1) protects the genome by limiting nucleolytic processing of DSBs by a mechanism that requires its phosphorylation, but whether 53BP1 does so directly is not known. Here, we identify Rap1-interacting factor 1 (Rif1) as an ATM (ataxia-telangiectasia mutated) phosphorylation-dependent interactor of 53BP1 and show that absence of Rif1 results in 5′-3′ DNA-end resection in mice. Consistent with enhanced DNA resection, Rif1 deficiency impairs DNA repair in the G1 and S phases of the cell cycle, interferes with class switch recombination in B lymphocytes, and leads to accumulation of chromosome DSBs.
The cytidine deaminase AID hypermutates immunoglobulin genes but can also target oncogenes, leading to tumorigenesis. The extent of AID's promiscuity and its predilection for immunoglobulin genes are unknown. We report here that AID interacted broadly with promoter-proximal sequences associated with stalled polymerases and chromatin-activating marks. In contrast, genomic occupancy of replication protein A (RPA), an AID cofactor, was restricted to immunoglobulin genes. The recruitment of RPA to the immunoglobulin loci was facilitated by phosphorylation of AID at Ser38 and Thr140. We propose that stalled polymerases recruit AID, thereby resulting in low frequencies of hypermutation across the B cell genome. Efficient hypermutation and switch recombination required AID phosphorylation and correlated with recruitment of RPA. Our findings provide a rationale for the oncogenic role of AID in B cell malignancy.
Chromosomal rearrangements, including translocations, require formation and joining of DNA double strand breaks (DSBs). These events disrupt the integrity of the genome and are frequently involved in producing leukemias, lymphomas and sarcomas. Despite the importance of these events, current understanding of their genesis is limited. To examine the origins of chromosomal rearrangements we developed Translocation Capture Sequencing (TC-Seq), a method to document chromosomal rearrangements genome-wide, in primary cells. We examined over 180,000 rearrangements obtained from 400 million B lymphocytes, revealing that proximity between DSBs, transcriptional activity and chromosome territories are key determinants of genome rearrangement. Specifically, rearrangements tend to occur in cis and to transcribed genes. Finally, we find that activation-induced cytidine deaminase (AID) induces the rearrangement of many genes found as translocation partners in mature B cell lymphoma.
► A new genome-wide mapping method identifies translocations in primary cells ► Transcription favors chromosome rearrangement ► Rearrangements define chromosome territories in B cells ► AID-mediated translocations are found in many genes, including protooncogenes
Identification of chromosomal rearrangements on a genome-wide scale highlights the relative contributions of 3D-chromosomal organization, active transcription, and AID-activity to oncogenic translocations.
Although the cellular concentration of miRNAs is critical to their function, how miRNA expression and abundance are regulated during ontogeny is unclear. We applied miRNA-, mRNA-, and ChIP-Seq to characterize the microRNome during lymphopoiesis within the context of the transcriptome and epigenome. We show that lymphocyte-specific miRNAs are either tightly controlled by polycomb group-mediated H3K27me3 or maintained in a semi-activated epigenetic state prior to full expression. Because of miRNA biogenesis, the cellular concentration of mature miRNAs does not typically reflect transcriptional changes. However, we uncover a subset of miRNAs for which abundance is dictated by miRNA gene expression. We confirm that concentration of 5p and 3p miRNA strands depends largely on free energy properties of miRNA duplexes. Unexpectedly, we also find that miRNA strand accumulation can be developmentally regulated. Our data provide a comprehensive map of immunity's microRNome and reveal the underlying epigenetic and transcriptional forces that shape miRNA homeostasis.
► H3K27me3 inhibits expression of induced miRNAs during lymphopoiesis ► Lymphocyte-specific, poised miRNAs are not downregulated by H3K27me3 ► Fluctuations in 25% of all miRNAs are dictated by transcription in B cells ► miRNA strand accumulation can be developmentally regulated
The “CTCF code” hypothesis posits that CTCF pleiotropic functions are driven by recognition of diverse sequences through combinatorial use of its 11 zinc fingers (ZFs). This model, however, is supported by in vitro binding studies of a limited number of sequences. To study CTCF multivalency in vivo, we define ZF binding requirements at ∼50,000 genomic sites in primary lymphocytes. We find that CTCF reads sequence diversity through ZF clustering. ZFs 4–7 anchor CTCF to ∼80% of targets containing the core motif. Nonconserved flanking sequences are recognized by ZFs 1–2 and ZFs 8–11 clusters, which also stabilize CTCF broadly. Alternatively, ZFs 9–11 associate with a second phylogenetically conserved upstream motif at ∼15% of its sites. Individually, ZFs increase overall binding and chromatin residence time. Unexpectedly, we also uncovered a conserved downstream DNA motif that destabilizes CTCF occupancy. Thus, CTCF associates with a wide array of DNA modules via combinatorial clustering of its 11 ZFs.
• Genome-wide maps of 11 CTCF zinc finger mutants in B lymphocytes • Zinc finger mutations differentially affect CTCF binding and nuclear mobility • CTCF uses zinc finger clusters to recognize DNA sequence diversity • DNA sequences flanking the core motif modulate CTCF binding
CTCF is a nuclear architectural protein that binds to thousands of highly diverse sequences in eukaryotes. The current hypothesis, known as the “CTCF code,” proposes that CTCF binds DNA targets through combinatorial use of its 11 zinc fingers (ZFs). This model, however, is mostly supported by in vitro binding studies. By expressing ZF mutants in B lymphocytes, Resch, Casellas, and colleagues now present genome-wide maps of CTCF multivalency. They show that CTCF reads sequence diversity by relying on well-defined ZF clusters.
Lymphocyte activation is initiated by a global increase in messenger RNA synthesis. However, the mechanisms driving transcriptome amplification during the immune response are unknown. By monitoring single-stranded DNA genome wide, we show that the genome of naive cells is poised for rapid activation. In G0, ∼90% of promoters from genes to be expressed in cycling lymphocytes are polymerase loaded but unmelted and support only basal transcription. Furthermore, the transition from abortive to productive elongation is kinetically limiting, causing polymerases to accumulate nearer to transcription start sites. Resting lymphocytes also limit the expression of the transcription factor IIH complex, including XPB and XPD helicases involved in promoter melting and open complex extension. To date, two rate-limiting steps have been shown to control global gene expression in eukaryotes: preinitiation complex assembly and polymerase pausing. Our studies identify promoter melting as a third key regulatory step and propose that this mechanism ensures a prompt lymphocyte response to invading pathogens.
• Lymphocyte activation induces a proportional amplification of the transcriptome • ssDNA-seq detects promoter melting and non-B DNA in living cells • Promoters in G0 lymphocytes are PolII loaded but unmelted • TFIIH expression and activity are limited in G0 lymphocytes
In addition to preinitiation complex assembly and polymerase pausing, DNA melting can also regulate transcription in eukaryotes. In lymphocytes, immune promoters loaded with polymerase are restrained until the DNA is melted, possibly via changes in TFIIH levels.
Viruses with large, double-stranded DNA genomes captured the majority of their genes from their hosts at different stages of evolution. The origins of many virus genes are readily detected through significant sequence similarity with cellular homologs. In particular, this is the case for virus enzymes, such as DNA and RNA polymerases or nucleotide kinases, that retain their catalytic activity after capture by an ancestral virus. However, a large fraction of virus genes have no readily detectable cellular homologs, meaning that their origins remain enigmatic. We explored the potential origins of such proteins that are encoded in the genomes of orthopoxviruses, a thoroughly studied virus genus that includes major human pathogens. To this end, we used AlphaFold2 to predict the structures of all 214 proteins that are encoded by orthopoxviruses. Among the proteins of unknown provenance, structure prediction yielded clear indications of origin for 14 of them and validated several inferences that were previously made via sequence analysis. A notable emerging trend is the exaptation of enzymes from cellular organisms for nonenzymatic, structural roles in virus reproduction that is accompanied by the disruption of catalytic sites and by an overall drastic divergence that precludes homology detection at the sequence level. Among the 16 orthopoxvirus proteins that were found to be inactivated enzyme derivatives are the poxvirus replication processivity factor A20, which is an inactivated NAD-dependent DNA ligase; the major core protein A3, which is an inactivated deubiquitinase; F11, which is an inactivated prolyl hydroxylase; and more similar cases. For nearly one-third of the orthopoxvirus virion proteins, no significantly similar structures were identified, suggesting exaptation with subsequent major structural rearrangement that yielded unique protein folds.
Protein structures are more strongly conserved in evolution than are amino acid sequences. Comparative structural analysis is particularly important for inferring the origins of viral proteins that typically evolve at high rates. We used a powerful protein structure modeling method, namely, AlphaFold2, to model the structures of all orthopoxvirus proteins and compared them to all available protein structures. Multiple cases of recruitment of host enzymes for structural roles in viruses, accompanied by the disruption of catalytic sites, were discovered. However, many viral proteins appear to have evolved unique structural folds.
Rapid increases in DNA sequencing capabilities have led to a vast increase in the data generated from prokaryotic genomic studies, which has been a boon to scientists studying micro-organism evolution and to those who wish to understand the biological underpinnings of microbial systems. The NCBI Protein Clusters Database (ProtClustDB) has been created to efficiently maintain and keep the deluge of data up to date. ProtClustDB contains both curated and uncurated clusters of proteins grouped by sequence similarity. The May 2008 release contains a total of 285 386 clusters derived from over 1.7 million proteins encoded by 3806 nt sequences from the RefSeq collection of complete chromosomes and plasmids from four major groups: prokaryotes, bacteriophages and the mitochondrial and chloroplast organelles. There are 7180 clusters containing 376 513 proteins with curated gene and protein functional annotation. PubMed identifiers and external cross references are collected for all clusters and provide additional information resources. A suite of web tools is available to explore more detailed information, such as multiple alignments, phylogenetic trees and genomic neighborhoods. ProtClustDB provides an efficient method to aggregate gene and protein annotation for researchers and is available at http://www.ncbi.nlm.nih.gov/sites/entrez?db=proteinclusters.