Detection of somatic mutations in human leukocyte antigen (HLA) genes using whole-exome sequencing (WES) is hampered by the high polymorphism of the HLA loci, which prevents alignment of sequencing reads to the human reference genome. We describe a computational pipeline that enables accurate inference of germline alleles of class I HLA-A, B and C genes and subsequent detection of mutations in these genes using the inferred alleles as a reference. Analysis of WES data from 7,930 pairs of tumor and healthy tissue from the same patient revealed 298 nonsilent HLA mutations in tumors from 266 patients. These 298 mutations are enriched for likely functional mutations, including putative loss-of-function events. Recurrence of mutations suggested that these 'hotspot' sites were positively selected. Cancers with recurrent somatic HLA mutations were associated with upregulation of signatures of cytolytic activity characteristic of tumor infiltration by effector lymphocytes, supporting immune evasion by altered HLA function as a contributory mechanism in cancer.
Pituitary adenomas are the second most common primary brain tumor, yet their genetic profiles are incompletely understood.
We performed whole-exome sequencing of 42 pituitary macroadenomas and matched normal DNA. These adenomas included hormonally active and inactive tumors, ones with typical or atypical histology, and ones that were primary or recurrent.
We identified mutations, insertions/deletions, and copy-number alterations. Nearly one-third of samples (29%) had chromosome arm-level copy-number alterations across large fractions of the genome. Despite such widespread genomic disruption, these tumors had few focal events, which is unusual among highly disrupted cancers. The other 71% of tumors formed a distinct molecular class, with somatic copy number alterations involving less than 6% of the genome. Among the highly disrupted group, 75% were functional adenomas or atypical null-cell adenomas, whereas 87% of the less-disrupted group were nonfunctional adenomas. We confirmed this association between functional subtype and disruption in a validation dataset of 87 pituitary adenomas. Analysis of previously published expression data from an additional 50 adenomas showed that arm-level alterations significantly impacted transcript levels, and that the disrupted samples were characterized by expression changes associated with poor outcome in other cancers. Arm-level losses of chromosomes 1, 2, 11, and 18 were significantly recurrent. No significantly recurrent mutations were identified, suggesting no genes are altered by exonic mutations across large fractions of pituitary macroadenomas.
These data indicate that sporadic pituitary adenomas have distinct copy-number profiles that associate with hormonal and histologic subtypes and influence gene expression.
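The split into highly disrupted and less-disrupted tumors can be illustrated with a minimal sketch. It assumes copy-number segments are given as (start, end, log2 ratio) tuples; the function names, the log2-ratio cutoff of 0.2, and the use of the 6% figure as a hard class boundary are assumptions made for illustration, not the study's actual procedure.

```python
from typing import List, Tuple

# Hypothetical segment format: (start, end, log2 copy ratio).
Segment = Tuple[int, int, float]


def fraction_genome_altered(segments: List[Segment],
                            genome_length: int,
                            log2_cutoff: float = 0.2) -> float:
    """Fraction of the genome covered by segments whose absolute
    log2 copy ratio exceeds log2_cutoff (an assumed threshold)."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    altered = sum(end - start
                  for start, end, log2_ratio in segments
                  if abs(log2_ratio) > log2_cutoff)
    return altered / genome_length


def disruption_class(fraction_altered: float,
                     boundary: float = 0.06) -> str:
    """Label a tumor by its altered fraction. The 6% boundary is
    taken from the abstract's description of the less-disrupted
    class; treating it as a hard cut is an assumption."""
    return "less-disrupted" if fraction_altered < boundary else "disrupted"
```

For example, a tumor whose only altered segment spans 5% of the genome would be labeled `less-disrupted`, whereas one with arm-level losses covering half the genome would be labeled `disrupted`.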
Along with the traditional effects of aging and carcinogen exposure, inherited DNA variation contributes substantially to cancer risk. The extraordinary progress made in analyzing common variation with GWAS methodology does not provide sufficient resolution to understand rare variation. To address the missing classification of rare germline variation, we assembled a dataset of whole-exome sequences from >2000 patients (selected cases that tested negative for candidate genes, and unselected cases) with different types of cancer (breast cancer, colon cancer, and cutaneous and ocular melanomas), matched to more than 7000 non-cancer controls, and analyzed germline variation in known cancer-predisposing genes to identify common properties of disease-associated DNA variation and to aid future searches for new cancer susceptibility genes. Cancer-predisposing genes were divided into non-overlapping classes according to the mode of inheritance of the related cancer syndrome or known tumor suppressor activity. Of all classes, only genes linked to dominant syndromes showed significant enrichment of rare germline variants in cases. Separate analysis of protein-truncating and missense variation in this list of genes confirmed a significant excess of protein-truncating variants in cases only in loss-of-function tolerant genes (pLI < 0.1), whereas ultra-rare missense variants were significantly overrepresented in cases only in constrained genes (pLI > 0.9). In addition to the findings in genetically enriched cases, we observed a significant burden of rare variation in unselected cases, suggesting a substantial role for inherited variation even in cancers that manifest relatively late. Taken together, our findings provide a reference for the distribution and types of DNA variation underlying inherited predisposition to some common cancer types.
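One common way to test for enrichment of rare-variant carriers in cases versus controls is a one-sided Fisher's exact test on carrier counts. The sketch below is illustrative only: the abstract does not state which statistical test was used, and the function name and argument layout are hypothetical.

```python
from math import comb


def carrier_enrichment_pvalue(case_carriers: int, case_total: int,
                              control_carriers: int, control_total: int) -> float:
    """One-sided Fisher's exact test: probability of observing at
    least `case_carriers` carriers among cases if carrier status
    were independent of case/control status (hypergeometric tail)."""
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)
    # Sum hypergeometric probabilities for k = observed .. maximum possible.
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator
```

A small p-value for a gene class would indicate more carriers among cases than expected by chance. In practice such tests are usually run per gene set, with correction for multiple testing and for population structure, neither of which this sketch handles.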
Alterations in DNA repair pathways are common in tumors and can result in characteristic mutational signatures; however, a specific mutational signature associated with somatic alterations in the nucleotide-excision repair (NER) pathway has not yet been identified. Here we examine the mutational processes operating in urothelial cancer, a tumor type in which the core NER gene ERCC2 is significantly mutated. Analysis of three independent urothelial tumor cohorts demonstrates a strong association between somatic ERCC2 mutations and the activity of a mutational signature characterized by a broad spectrum of base changes. In addition, we note an association between the activity of this signature and smoking that is independent of ERCC2 mutation status, providing genomic evidence of tobacco-related mutagenesis in urothelial cancer. Together, these analyses identify an NER-related mutational signature and highlight the related roles of DNA damage and subsequent DNA repair in shaping tumor mutational landscape.
Biallelic inactivation of BRCA1 or BRCA2 is associated with a pattern of genome-wide mutations known as signature 3. By analyzing ∼1,000 breast cancer samples, we confirmed this association and established that germline nonsense and frameshift variants in PALB2, but not in ATM or CHEK2, can also give rise to the same signature. We were able to accurately classify missense BRCA1 or BRCA2 variants known to impair homologous recombination (HR) on the basis of this signature. Finally, we show that epigenetic silencing of RAD51C and BRCA1 by promoter methylation is strongly associated with signature 3 and, in our data set, was highly enriched in basal-like breast cancers in young individuals of African descent.
Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.
Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) the circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report only variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies.
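The heteroplasmy reporting rule can be sketched as a simple filter on the alternate-allele read fraction. The 10% reporting floor comes from the abstract; the function name, the read-count inputs, and the 95% cutoff used here to call a variant homoplasmic are assumptions for illustration rather than details stated above.

```python
def classify_mt_variant(alt_reads: int, total_reads: int,
                        min_heteroplasmy: float = 0.10,
                        homoplasmic_cutoff: float = 0.95) -> str:
    """Classify an mtDNA variant call by its heteroplasmy level,
    estimated as alt_reads / total_reads.

    min_heteroplasmy follows the >=10% rule in the abstract;
    homoplasmic_cutoff is an assumed value, not taken from it.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    heteroplasmy = alt_reads / total_reads
    if heteroplasmy < min_heteroplasmy:
        return "not reported"
    if heteroplasmy < homoplasmic_cutoff:
        return "heteroplasmic"
    return "homoplasmic"
```

Low-level calls are dropped here because, as noted above, NUMT misalignment can account for the majority of putative heteroplasmies; a read-fraction filter alone does not address sample contamination or low mtDNA copy number, which the pipeline handles by excluding samples.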
Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.
• Public release of gene-based association statistics for 4,529 diseases and traits
• Genebass, a browser framework to display rare-variant associations
• Tight coupling between frequency, natural selection, and power for genetic discovery
• Biological signal between SCRIB and white-matter integrity (from MRI)
Karczewski et al. generated a massive-scale association dataset between rare genetic mutations and thousands of diseases and traits and released these data in the Genebass browser. They quantify the influence of natural selection and gene function on association discovery and highlight an association between SCRIB and a brain-imaging trait.