All life on Earth is unified by its use of a shared set of component chemical compounds and reactions, providing a detailed model for universal biochemistry. However, this notion of universality is specific to known biochemistry and does not allow quantitative predictions about examples not yet observed. Here, we introduce a more generalizable concept of biochemical universality that is more akin to the kind of universality found in physics. Using annotated genomic datasets including an ensemble of 11,955 metagenomes, 1,282 archaea, 11,759 bacteria, and 200 eukaryotic taxa, we show how enzyme functions form universality classes with common scaling behavior in their relative abundances across the datasets. We verify that these scaling laws are not explained by the presence of compounds, reactions, and enzyme functions shared across known examples of life. We demonstrate how these scaling laws can be used as a tool for inferring properties of ancient life by comparing their predictions with a consensus model for the last universal common ancestor (LUCA). We also illustrate how network analyses shed light on the functional principles underlying the observed scaling behaviors. Together, our results establish the existence of a new kind of biochemical universality, independent of the details of life on Earth's component chemistry, with implications for guiding our search for missing biochemical diversity on Earth or for biochemistries that might deviate from the exact chemical makeup of life as we know it, such as at the origins of life, in alien environments, or in the design of synthetic life.
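As an illustration of what a scaling law of this kind means in practice, the sketch below fits a power law y = c * x^alpha by ordinary least squares in log-log space. The abstract does not state the fitting procedure, so this is an assumed, generic approach rather than the authors' method; the function name `fit_power_law` and the toy numbers are hypothetical, with x standing for the total number of annotated enzyme functions in a genome and y for the number falling in one functional class.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = c * x**alpha by least squares on log(x), log(y).

    Returns (alpha, c). All inputs must be strictly positive.
    This is a generic estimator, assumed here for illustration only.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = sxy / sxx                       # slope in log-log space
    c = math.exp(mean_y - alpha * mean_x)   # intercept mapped back to linear scale
    return alpha, c


# Hypothetical toy data generated from y = 2 * x**0.5
totals = [10.0, 100.0, 1000.0]
in_class = [2.0 * t ** 0.5 for t in totals]
alpha, c = fit_power_law(totals, in_class)
```

An exponent below 1 would indicate that the class grows more slowly than the total (sublinear scaling), and an exponent above 1 the opposite; comparing exponents across datasets is one way to ask whether they share a scaling class.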
Acquisition of a hyperdiploid (HY) karyotype or immunoglobulin heavy chain (IGH) translocations are considered key initiating events in multiple myeloma (MM). To explore if other genomic events can precede these events, we analyzed whole-genome sequencing (WGS) data from 1173 MM samples. Integrating molecular time and structural variants (SV) within early chromosomal duplications, we indeed identified pre-gain deletions in 9.4% of HY patients without IGH translocations, challenging HY as the earliest somatic event. Remarkably, these deletions affected tumor suppressor genes (TSG) and/or oncogenes in 2.4% of HY patients without IGH translocations, supporting their role in MM pathogenesis. Furthermore, our study points to post-gain deletions as novel driver mechanisms in MM. Using multi-omics approaches to investigate their biological impact, we found associations with poor clinical outcome in newly diagnosed patients and profound effects on both oncogene and TSG activity, despite the diploid gene status. Overall, this study provides novel insights into the temporal dynamics of genomic alterations in MM.
The biological role of the non-hemopoietic components of the bone marrow (BM) niche in the pathogenesis of multiple myeloma (MM) is poorly understood. We utilized a murine model to interrogate niche-forming cells in more detail early in the natural history of MM, at a precursor stage (MGUS/smoldering-like). Studying interactions between the MM clone and the stromal components of the MM niche is challenging, as these components represent only approximately 0.2% of the total marrow content and are difficult to isolate. However, by isolating the stromal compartment of the BM through flow-sorting depletion of hemopoietic cells, combined with single-cell RNA sequencing (scRNA-seq), it is possible to perform an in vivo analysis of the individual cellular components of the stromal microenvironment and to characterize their relative abundance, cellular differentiation state, and transcriptomic profile. To comprehensively assess changes in the stromal cells during the early stage of MM, 5TGM1 cells were intravenously injected into aged (6-month-old) KaLwRij mice. KaLwRij mice without MM were used as controls. Stromal cells were isolated following tumor engraftment in the marrow at a precursor stage and enriched as previously described (Baryawno et al, Cell 2019). scRNA-seq was performed on 5 libraries of cells using the 10X Genomics Chromium platform. All data analysis was performed in R. After filtering out low-quality cells, 45,030 cells remained for analysis. The 5 libraries were integrated with Seurat and run through a standard clustering workflow involving log-normalization, finding variable features, dimensional reduction, and clustering. Cell type annotation for each cluster was performed manually using established markers for both the stromal and immune compartments. After excluding immune cell populations, 14,219 cells remained.
Select subpopulations were re-clustered by isolating the cell types of interest and running them through separate standard clustering workflows. For each population of interest, differentially expressed genes were determined using the Wilcoxon rank sum test in Seurat, trajectory analysis and pseudotime were computed using Monocle3, and enriched gene sets were computed using GSEAPreranked. We identified 7 distinct populations of stromal cells including mesenchymal stromal cells (MSCs) (Lepr, Adipoq, and Cxcl12), osteo-lineage cells (OLCs) (Bglap, Spp1, and Sp7), fibroblasts (S100a4, Fn1, and Dcn), chondrocytes (Col2a1, Sox9, and Acan), pericytes (Acta2, Myh11, and Mcam), and two endothelial cell (EC) populations, arterial (AEC) and sinusoidal (SEC) (Cdh5, Cd34, and Pecam1). Compared to normal BM, MM engraftment was associated with numerical differences in stromal cell populations. Gene set enrichment analysis showed an inflammatory and oxidative stress signal associated with the MM microenvironment. Sub-clustering analysis showed that MSC differentiation was polarized away from osteocyte formation towards adipocytes, with the identification of a novel population seen only in MM. Bone marrow endothelial cell populations were also substantially altered at this early disease stage, with differentiation polarized towards sinusoidal endothelial cells, generating a pro-angiogenic/pro-inflammatory phenotype. An increase in cells undergoing endothelial-to-mesenchymal transition (EndMT) was also present. Taken together, we show, for the first time, remodeling of the stromal populations induced by the MM clone, characterized by a pro-inflammatory phenotype together with polarized differentiation. These changes result in the expansion of a number of key populations that increase the MM niche, contributing to growth and survival signals and to the shaping of the content of the immune microenvironment.
These changes result in a self-perpetuating signaling loop between cells, which needs to be broken therapeutically in order to stop progression, induce remission, and improve long-term disease outcomes. Conclusion - In early stages of MM pathogenesis, MM cells remodel the stromal microenvironment by altering the number and function of MSCs and endothelial cells. By favoring an adipocytic fate of MSCs, promoting endothelial-to-mesenchymal transition, and altering the balance between arterial and sinusoidal endothelial cells, MM cells promote an inflammatory environment that contributes to MM development and progression.
INTRODUCTION: Little is known about the pattern and function of mutations within the 98% of the genome which is non-coding (nc). Whole-genome sequencing (WGS) can identify the full range of single nucleotide variants (SNVs), insertions/deletions (InDels), copy-number variants (CNVs), and structural variants (SVs), which are critical to disease progression. Here, we characterize the non-coding genome to gain insight into the role of mutations in gene regulatory elements in the etiology of multiple myeloma (MM) and to inform models of how it develops. METHODS: We studied 302 MM precursor and newly diagnosed MM (NDMM) patients with high-coverage WGS, where each SNV/InDel was confirmed by two or more algorithms. Results were validated on an independent cohort of 256 NDMM with 80X WGS data. A pipeline employing a consensus mechanism for determining the final set of somatic events was used, including Mutect2, Strelka2, and VarScan2 for SNVs; Mutect2, Strelka2, VarScan2, and SvABA for InDels; Battenberg and FACETS for CNVs; Manta, SvABA, DELLY2, and IgCaller for SVs (https://github.com/pblaney/mgp1000). The R package fishHook was used to identify statistically significant enrichment of mutations. To identify nc-variants we partitioned the genome into 10 kb tiles that were iteratively shifted by 500 bp and tested each tile against a regression model built into fishHook. The model includes a series of covariates that inform replication timing, sequencing context, and chromatin states. RESULTS: We identified 2,039,841 SNVs and 492,746 InDels in total. The tumor mutational burden (TMB) varies between molecular subgroups, with the t(4;14) being significantly higher at 3.23 (somatic mutations per Mb) in comparison to the t(11;14) at 2.57 (FDR adj. P=0.035), which was closer to patients without a subtype translocation at 2.78. For ncSNVs and ncInDels, we identified 4,374 and 272 tiles respectively with significant mutation enrichment genome-wide (FDR adj. P<0.05).
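The rule that each SNV/InDel be confirmed by two or more algorithms can be sketched as a simple vote across callers. This is a minimal illustration of that idea only, not the implementation in the cited pipeline; the function name `consensus_calls`, the variant key layout (chromosome, position, reference allele, alternate allele), and the toy calls are all assumptions made for the example.

```python
from collections import defaultdict


def consensus_calls(calls_by_caller, min_callers=2):
    """Keep variants reported by at least `min_callers` distinct callers.

    calls_by_caller maps a caller name to an iterable of variant keys.
    Here a key is assumed to be (chrom, pos, ref, alt); real pipelines
    usually normalize variant representation before comparing.
    Returns {variant_key: sorted list of supporting callers}.
    """
    support = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support[variant].add(caller)
    return {
        variant: sorted(callers)
        for variant, callers in support.items()
        if len(callers) >= min_callers
    }


# Hypothetical toy input using the SNV callers named in the Methods
toy_calls = {
    "Mutect2": [("chr1", 1000, "A", "T"), ("chr2", 500, "G", "C")],
    "Strelka2": [("chr1", 1000, "A", "T")],
    "VarScan2": [("chr3", 42, "C", "G")],
}
confirmed = consensus_calls(toy_calls)
```

In the toy input only the chr1 variant is seen by two callers, so it is the only one retained; the single-caller variants are dropped.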
As tiles may overlap, we collapsed contiguous segments into consensus regions, assigning the nearest coding gene as an identifier, and termed these “mutation-enriched regions” or MERs. We identified 282 MERs associated with 203 genes for ncSNVs and 26 MERs associated with 25 genes for ncInDels. The two types of regions overlap at six loci (TENT5C, OR2T2, FOXD4L1, BCL6, BLOC1S5-TXNDC5, PLD5P1). Thus, we identified 302 MERs associated with 221 genes, with some of the most highly mutated MERs including BCL6 (76.2% of patients), BLOC1S5-TXNDC5 (28.1%), ZFP36L1 (22.2%), BTG2 (21.2%), IRF8 (16.2%), TENT5C (13.6%), and CCND1 (12.3%). In total, 19,743 of the 2,532,587 mutations fall into MERs, with 1.3-65.6% of patients having one of these mutations. We evaluated the MERs for functional relevance by intersecting the regions with a list of 8,357 genome-wide enhancer (E) and super-enhancer (SE) elements derived from germinal-center B cells (GCB), DLBCL (Bal et al. Nature 2022) and MM (Lovén et al. Cell 2013). In total, 17.9% (54/302) of the MERs intersected these elements, involving 45 genes. These MERs intersected with 28 Es and 20 SEs, with a non-random distribution of mutations within them. Of the total MER mutations, 21.9% (4,317/19,743) fell into some form of enhancer element. Breaking these down further, 41.6% (1,798/4,317) are in Es and 58.4% (2,519/4,317) are in SEs. All the E mutations were MM specific; of the SE mutations, 6.0% (18/302) of patients had a mutation in an ABC-DLBCL specific SE, 16.6% (50/302) in a GCB specific SE and 64.2% (194/302) in a MM specific SE. We examined the distribution of mutations within the SE regions and found that it is non-random, suggesting a selective mechanism. We intersected the SE regions with SVs and found an excess at TENT5C, BTG2, BLOC1S5-TXNDC5 and ZFP36L1. A focused analysis of chr1p, chr1q, chr6q and chr14 revealed the importance of mutationally induced breaks within the SE and its translocation to a receptor site, often 8q, the site of MYC.
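The tiling and collapsing steps (10 kb tiles shifted by 500 bp, then contiguous significant tiles merged into regions) can be sketched as follows. This is an assumed, simplified version for one chromosome: the statistical testing done by fishHook is not reproduced, the coordinates are 0-based half-open by choice, and the helper names `sliding_tiles` and `collapse_tiles` are hypothetical.

```python
def sliding_tiles(chrom_length, tile_size=10_000, step=500):
    """Yield (start, end) half-open tiles of `tile_size` bp, shifted by `step` bp.

    Tiles that would run past the chromosome end are skipped in this sketch.
    """
    start = 0
    while start + tile_size <= chrom_length:
        yield (start, start + tile_size)
        start += step


def collapse_tiles(tiles):
    """Merge overlapping or directly adjacent tiles into consensus regions."""
    merged = []
    for start, end in sorted(tiles):
        if merged and start <= merged[-1][1]:
            # Extend the current region when the next tile overlaps or touches it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical set of tiles flagged as significant on one chromosome
significant = [(500, 10_500), (0, 10_000), (30_000, 40_000)]
regions = collapse_tiles(significant)
```

The first two toy tiles overlap and collapse into a single region, while the third stays separate. Combining the two variant classes then follows the counts given above: 282 ncSNV regions plus 26 ncInDel regions, minus the 6 shared loci, gives 302 MERs.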
CONCLUSIONS: We provide evidence for an important contribution of mutations within E and SE regions to the etiology of MM. This may involve either direct selection of mutations within the GC or by the re-entry of a memory B-cell carrying a pattern of mutations it acquired in a pre-MM phase, which then acquires a MM-specific driver. FIGURE: Distribution of mutations across MM genomes. A) Tumor mutational burden across MM subtypes. B) Q-Q plots of fishHook model for SNVs
INTRODUCTION: Improving health outcomes for patients with African ancestry (AA) is a key healthcare aim, but there is uncertainty as to whether disparities arise predominantly from socioeconomic differences or from genetic differences in tumor biology. It remains an open question whether multiple myeloma (MM) occurring in AA patients has a similar or different spectrum of genomic aberrations compared with patients of European ancestry. To date, studies have suggested that AA patients have an excess of t(11;14) and a deficiency of TP53 mutations. We have established a series of 302 cases of MM precursor and newly diagnosed MM with whole-genome sequencing (WGS) available that is enriched for self-declared AA or diverse ethnic background. METHODS: In collaboration with the NYGC and Polyethnic-1000 consortium, we sequenced tumor samples to a depth of 60-80X and normal tissue to 30-40X. We employed a bioinformatics pipeline with a consensus mechanism for somatic variant calling, including Mutect2, Strelka2, and VarScan2 for SNVs; Mutect2, Strelka2, VarScan2, and SvABA for InDels; Battenberg and FACETS for CNVs; Manta, SvABA, DELLY2, and IgCaller for SVs. Additionally, an admixture workflow was used to estimate each individual's ancestral lineage using continentally distinct references, comprising 23 regional populations within 5 super-populations from the 1000 Genomes Project (https://github.com/pblaney/mgp1000). Mutational signatures were calculated using the R package mmsig (Rustad et al. Comm. Bio. 2021). RESULTS: Using admixture estimations from 302 patients with high-coverage WGS together with 941 patients from the CoMMpass trial, we identified five clusters corresponding to single dominant genetic ancestries (median proportion >75% assignment to reference super-population), together with a cluster characterized by highly admixed individuals with no dominant genetic ancestry (median proportion <50%).
Of the total, 53.0% are in the European dominant (EUR) cluster, 26.5% in the African dominant (AFR) cluster, 8.6% in the American dominant (AMR) cluster, 7.9% in the highly admixed cluster, and 2.0% each in the East Asian dominant (EAS) and South Asian dominant (SAS) clusters. Stratifying patients by their cluster assignment and calculating the frequency of subtype translocations, we show that t(11;14) occurred in 25.8% of EUR patients but in only 14.6% of AFR (p=0.045) and 7.7% of AMR (p=0.05) patients. The frequency of the t(4;14) was more evenly distributed, with 15.7% in EUR, 11.5% in AMR, and 11% in AFR clusters. For the acquired somatic mutations, the tumor mutational burden (TMB) was lowest in the AFR cluster at 2.21 (median, somatic mutations per Mb), significantly lower than in the EUR cluster at 2.94 (FDR adj. P=4.3×10⁻⁶). The most striking genomic difference was observed when comparing the mutational signature landscape between AA and the other racial groups. Using WGS, AA had lower SBS1 and SBS5 absolute contributions compared to EUR, and this was largely responsible for the difference in TMB. SBS1 and SBS5 are known to be clock-like signatures, accumulating at a constant rate over time. Because no differences in age, cancer cell fraction, or coverage were observed between AA and EUR, this finding suggests a different clock-like mutational rate between AA and EUR. The higher TMB observed in EUR was also driven by higher APOBEC mutational activity (SBS2 and SBS13) compared to AA (p=0.05). Interestingly, 72% of all EUR had evident APOBEC activity, in contrast to 45% of the AA (p=0.001). This difference was confirmed after excluding the MM precursor patients, previously demonstrated to have lower APOBEC activity (Oben et al. Nat. Commun. 2021), and was validated on CoMMpass whole-exomes.
CONCLUSIONS: Leveraging one of the largest series of diverse patients with WGS, and integrating genomic data with comprehensive ancestry information, AA MM emerged as biologically different in terms of genomic drivers and mutational signatures, suggesting potential differences in etiology and genomic evolution over time. Further analysis will include molecular timing of clonal copy number gains and reconstruction of phylogenetic trees, with a view to improving our understanding of the etiology of MM development across patient genetic backgrounds. FIGURE: Genomic characteristics of myeloma across ancestries. A) Translocation percentage (number per cluster) and B) Tumor mutational burden across clusters.
INTRODUCTION: Multiple myeloma (MM) evolution is complex and heterogeneous. Hyperdiploidy (HRD) and translocations affecting the immunoglobulin heavy chain (IGH) locus are historically considered initiating genomic events of MM. The potential impact of genomic events acquired before known MM initiating events has never been addressed. METHODS: To investigate whether genomic deletions are acquired before and after known MM initiating events, we interrogated whole-genome sequencing (WGS) data from patients with newly diagnosed MM (NDMM, n=319) and relapsed MM (RRMM, n=59). Our newly developed analytical and chronological workflow integrating single nucleotide variants (SNV), structural variants (SV), and copy number variants (CNV) comprises two main steps. First, we estimated the molecular time (i.e., corrected ratio between duplicated and non-duplicated clonal SNV) of large clonal chromosomal duplications and identified the earliest set of chromosomal gains in each patient. Second, within each early large chromosomal gain we identified clonal SV mediating CNV loss. A deletion on a gain can generate three possible scenarios: 1) one of the duplicated alleles is lost after the gain (i.e., post-gain), causing a CNV jump from 3:1 to 2:1 (total alleles : minor alleles); 2) there is a deletion before the duplication (i.e., pre-gain), causing a CNV jump from 3:1 to 1:0; 3) the deletion occurs on the minor, non-duplicated allele, causing a CNV jump from 3:1 to 2:0. Timing the deletion relative to the chromosomal duplication is impossible in this third scenario. RESULTS: Molecular time data were successfully generated for 249/319 (78%) NDMM and 51/59 (86%) RRMM patients. Restricting our analysis to NDMM with HRD without canonical IGH translocations, 16/170 (9%) of patients acquired deletions before the earliest multi-chromosomal gains, suggesting MM precursors can acquire deletions before HRD.
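The three deletion scenarios described in the Methods map directly from the copy-number state inside the deleted segment to a timing label. The sketch below encodes only that lookup, assuming the flanking gained segment is 3:1 (total alleles : minor alleles) as in the text; the function name `classify_deletion_on_gain` and the label strings are hypothetical, and none of the workflow's SV or clonality filtering is represented.

```python
# Copy-number state inside the deletion -> timing relative to the gain,
# for a flanking gained segment at 3:1 (total alleles : minor alleles).
SCENARIOS = {
    (2, 1): "post-gain",        # one duplicated copy lost after the gain
    (1, 0): "pre-gain",         # deletion preceded the duplication
    (2, 0): "not timeable",     # minor, non-duplicated allele was deleted
}


def classify_deletion_on_gain(total_cn, minor_cn, flank=(3, 1)):
    """Label a deletion inside a gained segment from its (total, minor) state.

    Only the 3:1 flanking state described in the abstract is handled;
    any other combination is returned as 'unclassified'.
    """
    if flank != (3, 1):
        return "unclassified"
    return SCENARIOS.get((total_cn, minor_cn), "unclassified")


label = classify_deletion_on_gain(1, 0)
```

Keeping the mapping in a small table makes it explicit that the third state (2:0) is deliberately left untimed rather than forced into either category.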
Investigating the entire series, post-gain deletions were observed in 126/417 (30%) samples considering both early and late time windows. Leveraging a background model, we demonstrated that pre-gain events involved more tumor-suppressor genes (TSG) than expected by chance, including TCF3, ATM and TRAF3. In contrast, oncogenes were involved less than expected. In post-gain deletions, both oncogenes and TSG were involved more than expected by chance. To validate and assess the impact of deletions on driver gene expression, we investigated WGS and RNA sequencing data from the MMRF CoMMpass study (n=752 NDMM). Because the low coverage did not allow molecular time estimation, we limited our analysis to HRD without IGH translocations. Pre-gain and post-gain deletions were observed in 47/431 (11%) and 225/431 (52%) of patients, respectively. We defined loss- and gain-of-function events based on whether the expression level of a gene affected by the deletion was in the first or fourth quartile, respectively. Pre-gain deletions were mostly associated with downregulation of TSG expression (n=31 events in 13/431 (3%) patients). In contrast, post-gain deletions had a more heterogeneous impact, with both loss- and gain-of-function events. Gain-of-function events were driven by two main mechanisms: 1) the deletion joined an oncogene with a distal regulatory region, inducing its overexpression (n=44 events in 6/431 (1.3%) patients); 2) the deletion caused a new and expressed fusion (n=231 events in 109/431 (25%) patients), resulting in either loss- or gain-of-function. Surprisingly, post-gain deletions also had a major impact on TSG expression. In 60 patients (14%), we observed 146 post-gain deletions where expression of the affected TSG was downregulated to the level of cases with monoallelic and biallelic deletions, even though two alleles were retained.
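The quartile rule for calling loss- and gain-of-function events can be sketched as below. The abstract does not say which reference distribution the quartiles are taken from, so this example assumes they are computed per gene across the cohort's expression values; the function name `classify_expression_event`, the label strings, and the toy cohort are hypothetical.

```python
import statistics


def classify_expression_event(expression, cohort_expression):
    """Label a deleted gene's expression against cohort quartiles.

    First quartile  -> 'loss-of-function'
    Fourth quartile -> 'gain-of-function'
    Anything between is left 'unclassified'.
    Quartile cut points come from statistics.quantiles with its default
    method; the choice of method is an assumption of this sketch.
    """
    q1, _median, q3 = statistics.quantiles(cohort_expression, n=4)
    if expression <= q1:
        return "loss-of-function"
    if expression >= q3:
        return "gain-of-function"
    return "unclassified"


# Hypothetical cohort of expression values for one gene
cohort = list(range(1, 101))
event = classify_expression_event(10, cohort)
```

With real data the boundary cases matter more than in this toy cohort, so the handling of ties and the quantile method would need to match whatever the original analysis used.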
Finally, to validate these findings, we investigated 16 RRMM patients with available WGS, scATAC-seq and scRNA-seq and observed additional evidence of TSG downregulation after both pre- and post-gain deletions. CONCLUSION: Leveraging a large cohort of NDMM we show that somatic deletions can be acquired before HRD trisomies that are assumed to be initiating events. Furthermore, post-gain deletions emerged as a new mechanism inducing TSG down-regulation, despite an apparently diploid gene status.
Chromosome 1 (chr1) copy number abnormalities (CNAs) and structural variants (SV) are frequent in newly diagnosed multiple myeloma (NDMM) and associate with a heterogeneous impact on outcome, the drivers of which are largely unknown.
A multiomic approach comprising CRISPR, gene mapping of CNA and SV, methylation, expression, and mutational analysis was used to document the extent of chr1 molecular variants and their impact on pathway utilisation.
We identified two distinct groups of gain(1q): focal gains associated with limited gene expression changes and a neutral prognosis, and whole-arm gains, which associate with substantial gene expression changes, complex genetics and an adverse prognosis. CRISPR identified a number of dependencies on chr1 but only limited variants associated with acquired CNAs. We identified seven regions of deletion, nine of gain, three of chromothripsis (CT) and two of templated-insertion (TI), which contain a number of potential drivers. An additional mechanism involving hypomethylation of genes at 1q may contribute to the aberrant gene expression of a number of genes. Expression changes associated with whole-arm gains were substantial and gene set enrichment analysis identified metabolic processes, apoptotic resistance, signaling via the MAPK pathway, and upregulation of transcription factors as being key drivers of the adverse prognosis associated with these variants.
Multiple layers of genetic complexity shape the clinical phenotype associated with CNAs on chr1. Whole-arm gains of 1q define the critically important prognostic group and deregulate multiple pathways, which may offer therapeutic vulnerabilities.