How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA sequencing data from ~6700 samples across 29 normal tissues revealed multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We found that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, which suggests that environmental factors can promote somatic mosaicism. Mutation burden was associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over both time and number of cell divisions. Finally, normal tissues were found to harbor mutations in known cancer genes and hotspots. This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as a foundation for associating clonal expansion with environmental factors, aging, and risk of disease.
Mutational processes constantly shape the somatic genome, leading to immunity, aging, cancer, and other diseases. When cancer is the outcome, we are afforded a glimpse into these processes by the clonal expansion of the malignant cell. Here, we characterize a less explored layer of the mutational landscape of cancer: mutational asymmetries between the two DNA strands. Analyzing whole-genome sequences of 590 tumors from 14 different cancer types, we reveal widespread asymmetries across mutagenic processes, with transcriptional (“T-class”) asymmetry dominating UV-, smoking-, and liver-cancer-associated mutations and replicative (“R-class”) asymmetry dominating POLE-, APOBEC-, and MSI-associated mutations. We report a striking phenomenon of transcription-coupled damage (TCD) on the non-transcribed DNA strand and provide evidence that APOBEC mutagenesis occurs on the lagging-strand template during DNA replication. As more genomes are sequenced, studying and classifying their asymmetries will illuminate the underlying biological mechanisms of DNA damage and repair.
•Replicative and transcriptional mutational asymmetries are widespread across cancer
•APOBEC mutagenesis in humans primarily occurs on the lagging-strand template
•Mismatch repair balances asymmetric replication errors
•Transcription-coupled damage (TCD) introduces sense-strand mutations in liver cancer
Using an approach that distinguishes whether mutations in cancer genomes occurred on the transcribed or non-transcribed DNA strand with respect to transcription and on the leading or lagging strand with respect to replication, the predominant mutational mechanisms associated with different types of cancers and mutational patterns can be inferred.
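The transcriptional half of this strand assignment can be illustrated with a minimal Python sketch. It assumes mutations are reported on the reference (+) strand and that each one is annotated with the strand of the gene it falls in; the function name `strand_counts` and the tuple layout are hypothetical and not taken from the paper. Replicative asymmetry would additionally require replication-direction annotation, which is not modeled here.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def strand_counts(mutations, ref, alt):
    """Count how often ref>alt occurs on each strand relative to transcription.

    `mutations` is an iterable of (ref_base, alt_base, gene_strand) tuples,
    with bases given on the reference (+) strand and gene_strand "+" or "-".
    (Hypothetical input layout, chosen for illustration.)
    """
    counts = Counter()
    complement_change = (COMPLEMENT[ref], COMPLEMENT[alt])
    for r, a, gene_strand in mutations:
        # For a "+" gene the reference strand is the non-transcribed (coding)
        # strand; for a "-" gene it is the transcribed (template) strand.
        if gene_strand == "+":
            ref_role, other_role = "non_transcribed", "transcribed"
        else:
            ref_role, other_role = "transcribed", "non_transcribed"
        if (r, a) == (ref, alt):
            counts[ref_role] += 1
        elif (r, a) == complement_change:
            # The complementary change on the reference strand corresponds
            # to ref>alt on the opposite strand.
            counts[other_role] += 1
    return counts


toy = [("C", "A", "+"), ("C", "A", "+"), ("G", "T", "-"),
       ("G", "T", "+"), ("T", "C", "+")]
counts = strand_counts(toy, "C", "A")
```

In this toy input, three C>A changes fall on the non-transcribed strand and one on the transcribed strand; an excess of this kind across many genes is the sort of signal a transcriptional asymmetry analysis looks for.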
There is a striking and unexplained male predominance across many cancer types. A subset of X-chromosome genes can escape X-inactivation, which would protect females from complete functional loss by a single mutation. To identify putative 'escape from X-inactivation tumor-suppressor' (EXITS) genes, we examined somatic alterations from >4,100 cancers across 21 tumor types for sex bias. Six of 783 non-pseudoautosomal region (PAR) X-chromosome genes (ATRX, CNKSR2, DDX3X, KDM5C, KDM6A, and MAGEC3) harbored loss-of-function mutations more frequently in males (based on a false discovery rate < 0.1), in comparison to zero of 18,055 autosomal and PAR genes (Fisher's exact P < 0.0001). Male-biased mutations in genes that escape X-inactivation were observed in combined analysis across many cancers and in several individual tumor types, suggesting a generalized phenomenon. We conclude that biallelic expression of EXITS genes in females explains a portion of the reduced cancer incidence in females as compared to males across a variety of tumor types.
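The gene-level comparison quoted above (6 of 783 versus 0 of 18,055) can be checked with a small pure-Python Fisher's exact test. This is a sketch under the assumption that the test was applied to the 2x2 table of male-biased versus not-male-biased genes in the two gene sets; the authors' exact setup may differ, and the helper name `fisher_exact_two_sided` is ours.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x first-column counts in row 1,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: non-PAR X genes, autosomal + PAR genes.
# Columns: male-biased, not male-biased (assumed table layout).
p = fisher_exact_two_sided(6, 783 - 6, 0, 18055)
```

Under this assumed table, `p` comes out far below 1e-4, which is consistent with the reported P < 0.0001.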
Microsatellites (MSs) are tracts of variable-length repeats of short DNA motifs that exhibit high rates of mutation in the form of insertions or deletions (indels) of the repeated motif. Despite their prevalence, the contribution of somatic MS indels to cancer has been largely unexplored, owing to difficulties in detecting them in short-read sequencing data. Here we present two tools: MSMuTect, for accurate detection of somatic MS indels, and MSMutSig, for identification of genes containing MS indels at a higher frequency than expected by chance. Applying MSMuTect to whole-exome data from 6,747 human tumors representing 20 tumor types, we identified >1,000 previously undescribed MS indels in cancer genes. Additionally, we demonstrate that the number and pattern of MS indels can accurately distinguish microsatellite-stable tumors from tumors with microsatellite instability, thus potentially improving classification of clinically relevant subgroups. Finally, we identified seven MS indel driver hotspots: four in known cancer genes (ACVR2A, RNF43, JAK1, and MSH3) and three in genes not previously implicated as cancer drivers (ESRP1, PRDM2, and DOCK3).
Diffuse large B cell lymphoma (DLBCL), the most common lymphoid malignancy in adults, is a clinically and genetically heterogeneous disease that is further classified into transcriptionally defined activated B cell (ABC) and germinal center B cell (GCB) subtypes. We carried out a comprehensive genetic analysis of 304 primary DLBCLs and identified low-frequency alterations, captured recurrent mutations, somatic copy number alterations, and structural variants, and defined coordinate signatures in patients with available outcome data. We integrated these genetic drivers using consensus clustering and identified five robust DLBCL subsets, including a previously unrecognized group of low-risk ABC-DLBCLs of extrafollicular/marginal zone origin; two distinct subsets of GCB-DLBCLs with different outcomes and targetable alterations; and an ABC/GCB-independent group with biallelic inactivation of TP53, CDKN2A loss, and associated genomic instability. The genetic features of the newly characterized subsets, their mutational signatures, and the temporal ordering of identified alterations provide new insights into DLBCL pathogenesis. The coordinate genetic signatures also predict outcome independent of the clinical International Prognostic Index and suggest new combination treatment strategies. More broadly, our results provide a roadmap for an actionable DLBCL classification.
Which genetic alterations drive tumorigenesis and how they evolve over the course of disease and therapy are central questions in cancer biology. Here we identify 44 recurrently mutated genes and 11 recurrent somatic copy number variations through whole-exome sequencing of 538 chronic lymphocytic leukaemia (CLL) and matched germline DNA samples, 278 of which were collected in a prospective clinical trial. These include previously unrecognized putative cancer drivers (RPS15, IKZF3), and collectively identify RNA processing and export, MYC activity, and MAPK signalling as central pathways involved in CLL. Clonality analysis of this large data set further enabled reconstruction of temporal relationships between driver events. Direct comparison between matched pre-treatment and relapse samples from 59 patients demonstrated highly frequent clonal evolution. Thus, large sequencing data sets of clinically informative samples enable the discovery of novel genes associated with cancer, the network of relationships between the driver events, and their impact on disease relapse and clinical outcome.
Passenger Hotspot Mutations in Cancer
Hess, Julian M.; Bernards, Andre; Kim, Jaegil ...
Cancer Cell, 09/2019, Volume 36, Issue 3
Journal Article, Peer reviewed, Open access
Current statistical models for assessing hotspot significance do not properly account for variation in site-specific mutability, thereby yielding many false positives. We thus (i) detail a Log-normal-Poisson (LNP) background model that accounts for this variability in a manner consistent with models of mutagenesis; (ii) use it to show that passenger hotspots arise from all common mutational processes; and (iii) apply it to a ∼10,000-patient cohort to nominate driver hotspots with far fewer false positives compared with conventional methods. Overall, we show that many cancer hotspot mutations recurring at the same genomic site across multiple tumors are actually passenger events, recurring at inherently mutable genomic sites under no positive selection.
•Many cancer hotspots are passengers, recurring at inherently mutable genomic sites
•Known genomic covariates are insufficient to fully predict inherent mutability
•Our LNP model accurately infers latent variability beyond what current covariates predict
•Our LNP model identifies putative driver hotspots with far fewer false positives
Somatic hotspot mutations found in tumors are generally considered evidence for selection and are used to nominate tumor drivers. Hess et al. show that many hotspots occur at inherently mutable sites without selection and develop a model that accounts for these passenger hotspots, which can more accurately nominate true driver mutations.
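To see why variable site mutability produces passenger hotspots, the following Python sketch compares the chance of observing at least k mutations at a site under a plain Poisson background with the chance under a Poisson whose rate is log-normally distributed with the same mean. This is not the authors' implementation; the function names and the parameter values (`mean_rate`, `sigma`, `k`) are hypothetical and chosen only to illustrate the overdispersion effect.

```python
import math
import random


def poisson_tail(k, lam, n_terms=60):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail."""
    if lam <= 0:
        return 1.0 if k <= 0 else 0.0
    log_lam = math.log(lam)
    return sum(
        math.exp(-lam + i * log_lam - math.lgamma(i + 1))
        for i in range(k, k + n_terms)
    )


def lnp_tail(k, mean_rate, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(X >= k) when the Poisson rate is log-normal.

    mu is set so that the expected rate equals mean_rate, which makes the
    comparison with the plain Poisson background mean-matched.
    """
    rng = random.Random(seed)
    mu = math.log(mean_rate) - sigma ** 2 / 2
    total = sum(
        poisson_tail(k, rng.lognormvariate(mu, sigma)) for _ in range(n_samples)
    )
    return total / n_samples


# Hypothetical values: 0.01 expected mutations per site, log-scale spread 1.5.
p_poisson = poisson_tail(3, 0.01)
p_lnp = lnp_tail(3, 0.01, 1.5)
```

With these illustrative settings the LNP tail probability is much larger than the Poisson one, so a recurrence count that looks highly significant against a uniform-rate background can be unremarkable once site-to-site variability in mutability is allowed for.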
•Genes are often regulated when small RNAs pair to longer folded messenger RNAs.
•Our simple model overcomes the bottleneck of computing mRNA unpairing free energies.
•The average free energies of the four RNA bases span a wide range (≈2 k_B T).
•We implement our free energy model in our BindOligoNet algorithm.
•Improves mRNA splice site prediction; explains ribosome binding site base biases.
Expression of mRNA is often regulated by the binding of a small RNA (miRNA, snoRNA, siRNA). While the pairing contribution to the net free energy is well parameterized and can be computed in O(N) time, the cost of removing pre-existing mRNA secondary structure has not received sufficient attention. Conventional methods for computing the unfolding free energy of a target mRNA are costly, scaling as the cube of the number of target bases, O(N^3). Here we introduce a model to describe the unfolding costs of the binding site, which features surprisingly large differences in the free energy parameters for the four bases. The model is implemented in our O(N) algorithm, BindOligoNet. Donor splice site prediction is more accurate when using our calculation of spliceosomal U1-snRNA to mRNA net binding free energy. Our base-dependent free energies also correlate with efficient ribosome docking near the start codon.
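The linear-time idea can be sketched in Python under a strong simplifying assumption: that the unfolding cost of a binding site is additive over its bases, with one average cost per base type. The numerical values in `UNPAIR_COST` are hypothetical placeholders (chosen only to span about 2 k_B T, as the abstract describes) and are not the parameters of BindOligoNet; the function name is ours as well.

```python
from itertools import accumulate

# HYPOTHETICAL per-base unpairing costs in units of k_B*T. These are
# placeholders for illustration, not the published parameters; only the
# overall span (about 2 k_B*T) is taken from the abstract.
UNPAIR_COST = {"A": 0.0, "U": 0.5, "C": 1.5, "G": 2.0}


def site_unfolding_costs(mrna, site_len):
    """Additive unfolding cost of every length-site_len window, in O(N) time.

    A prefix sum over per-base costs lets each window be evaluated with a
    single subtraction instead of re-summing its bases.
    """
    prefix = [0.0] + list(accumulate(UNPAIR_COST[base] for base in mrna))
    return [
        prefix[i + site_len] - prefix[i]
        for i in range(len(mrna) - site_len + 1)
    ]


costs = site_unfolding_costs("GGAU", 2)  # windows: GG, GA, AU
```

Each window costs one subtraction after a single pass over the sequence, in contrast to the cubic scaling of a full secondary-structure calculation; whether a purely additive model is adequate is an assumption of this sketch.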