Various species of the intestinal microbiota have been associated with the development of colorectal cancer, but it has not been demonstrated that bacteria have a direct role in the occurrence of oncogenic mutations. Escherichia coli can carry the pathogenicity island pks, which encodes a set of enzymes that synthesize colibactin. This compound is believed to alkylate DNA on adenine residues and induces double-strand breaks in cultured cells. Here we expose human intestinal organoids to genotoxic pks+ E. coli by repeated luminal injection over five months. Whole-genome sequencing of clonal organoids before and after this exposure revealed a distinct mutational signature that was absent from organoids injected with isogenic pks-mutant bacteria. The same mutational signature was detected in a subset of 5,876 human cancer genomes from two independent cohorts, predominantly in colorectal cancer. Our study describes a distinct mutational signature in colorectal cancer and implies that the underlying mutational process results directly from past exposure to bacteria carrying the colibactin-producing pks pathogenicity island.
Several strands of evidence question the dogma that human mitochondrial DNA (mtDNA) is inherited exclusively down the maternal line, most recently in three families where several individuals harbored a 'heteroplasmic haplotype' consistent with biparental transmission. Here we report a similar genetic signature in 7 of 11,035 trios, with allelic fractions of 5-25%, implying biparental inheritance of mtDNA in 0.06% of offspring. However, analysing the nuclear whole genome sequence, we observe likely large rare or unique nuclear-mitochondrial DNA segments (mega-NUMTs) transmitted from the father in all 7 families. Independently detecting mega-NUMTs in 0.13% of fathers, we see autosomal transmission of the haplotype. Finally, we show the haplotype allele fraction can be explained by complex concatenated mtDNA-derived sequences rearranged within the nuclear genome. We conclude that rare cryptic mega-NUMTs can resemble paternal mtDNA heteroplasmy, but find no evidence of paternal transmission of mtDNA in humans.
Bronchiectasis can result from infectious, genetic, immunological and allergic causes. 60-80% of cases are idiopathic, but a well-recognised genetic cause is the motile ciliopathy, primary ciliary dyskinesia (PCD). Diagnosis of PCD has management implications including addressing comorbidities, implementing genetic and fertility counselling and future access to PCD-specific treatments. Diagnostic testing can be complex; however, PCD genetic testing is moving rapidly from research into clinical diagnostics and would confirm the cause of bronchiectasis.
This observational study used genetic data from severe bronchiectasis patients recruited to the UK 100,000 Genomes Project and patients referred for gene panel testing within a tertiary respiratory hospital. Patients referred for genetic testing due to clinical suspicion of PCD were excluded from both analyses. Data were accessed from the British Thoracic Society audit, to investigate whether motile ciliopathies are underdiagnosed in people with bronchiectasis in the UK.
Pathogenic or likely pathogenic variants were identified in motile ciliopathy genes in 17 (12%) out of 142 individuals by whole-genome sequencing. Similarly, in a single centre with access to pathological diagnostic facilities, 5-10% of patients received a PCD diagnosis by gene panel, often linked to normal/inconclusive nasal nitric oxide and cilia functional test results. In 4898 audited patients with bronchiectasis, <2% were tested for PCD and <1% received genetic testing.
PCD is underdiagnosed as a cause of bronchiectasis. Increased uptake of genetic testing may help to identify bronchiectasis due to motile ciliopathies and ensure appropriate management.
Whole-genome sequencing (WGS) of cancers is becoming an accepted component of oncological care, and NHS England is currently rolling out WGS for all children with cancer. This approach was piloted during the 100,000 genomes (100 K) project. Here we share the experience of the East of England Genomic Medicine Centre (East-GMC), reporting the feasibility and clinical utility of centralised WGS for individual children locally.
Non-consecutive children with solid tumours were recruited into the pilot 100 K project at our Genomic Medicine Centre. Variant catalogues were returned for local scrutiny and appraisal at dedicated genomic tumour advisory boards with an emphasis on a detailed exploration of potential clinical value.
Thirty-six children, representing one-sixth of the national 100 K cohort, were recruited through our Genomic Medicine Centre. The diagnoses encompassed 23 different solid tumour types and WGS provided clinical utility, beyond standard-of-care assays, by refining (2/36) or changing (4/36) diagnoses, providing prognostic information (8/36), defining pathogenic germline mutations (1/36) or revealing novel therapeutic opportunities (8/36).
Our findings demonstrate the feasibility and clinical value of centralised WGS for children with cancer. WGS offered additional clinical value, especially in diagnostic terms. However, our experience highlights the need for local expertise in scrutinising and clinically interpreting centrally derived variant calls for individual children.
Background
Genome sequencing was first offered clinically in the UK through the 100,000 Genomes Project (100KGP). Analysis was restricted to predefined gene panels associated with the patient’s phenotype. However, panels rely on clearly characterised phenotypes and risk missing diagnoses outside of the panel(s) applied. We propose a complementary method to rapidly identify pathogenic variants, including those missed by 100KGP methods.
Methods
The Loss-of-function Observed/Expected Upper-bound Fraction (LOEUF) score quantifies gene constraint, with low scores correlated with haploinsufficiency. We applied DeNovoLOEUF, a filtering strategy, to sequencing data from 13,949 rare disease trios in the 100KGP, filtering for rare, de novo, loss-of-function variants in disease genes with a LOEUF score < 0.2. We compared our findings with the corresponding patients’ diagnostic reports.
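The filtering criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, the list of loss-of-function consequence terms, and the rarity cutoff `MAX_ALLELE_FREQ` are all hypothetical; only the LOEUF threshold of 0.2 and the four criteria (rare, de novo, loss-of-function, disease gene) come from the text.

```python
from dataclasses import dataclass

LOEUF_THRESHOLD = 0.2      # stated in the text: LOEUF score < 0.2
MAX_ALLELE_FREQ = 0.001    # ASSUMPTION: illustrative rarity cutoff, not from the source

# ASSUMPTION: an illustrative set of loss-of-function consequence terms.
LOF_CONSEQUENCES = frozenset({
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
})


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record for one proband in a trio."""
    gene: str
    consequence: str
    allele_freq: float   # population allele frequency
    is_de_novo: bool     # absent from both parents


def de_novo_loeuf_filter(variants, loeuf_by_gene, disease_genes):
    """Keep rare, de novo, loss-of-function variants in constrained disease genes."""
    kept = []
    for v in variants:
        loeuf = loeuf_by_gene.get(v.gene)
        if loeuf is None:
            continue  # no constraint score available: cannot apply the filter
        if (
            v.is_de_novo
            and v.consequence in LOF_CONSEQUENCES
            and v.allele_freq <= MAX_ALLELE_FREQ
            and v.gene in disease_genes
            and loeuf < LOEUF_THRESHOLD
        ):
            kept.append(v)
    return kept
```

Because each criterion is a simple predicate, the filter can be rerun cheaply whenever the disease gene list is updated, which is consistent with the iterative reanalysis the authors anticipate.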
Results
324/332 (98%) of the variants identified using DeNovoLOEUF were diagnostic or partially diagnostic (whereby the variant was responsible for some of the phenotype). We identified 39 diagnoses that were “missed” by 100KGP standard analyses, which are now being returned to patients.
Conclusion
We have demonstrated a highly specific and rapid method with a 98% positive predictive value that has good concordance with standard analysis, low false-positive rate, and can identify additional diagnoses. Globally, as more patients are being offered genome sequencing, we anticipate that DeNovoLOEUF will rapidly identify new diagnoses and facilitate iterative analyses when new disease genes are discovered.
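As a quick check on the reported figures, the positive predictive value follows directly from the counts given in the Results (324 diagnostic or partially diagnostic variants out of 332 flagged). The helper name below is illustrative only.

```python
def positive_predictive_value(true_positives: int, total_flagged: int) -> float:
    """Fraction of flagged variants that were (partially) diagnostic."""
    if total_flagged <= 0:
        raise ValueError("total_flagged must be positive")
    return true_positives / total_flagged


# Counts taken from the Results: 324 of 332 flagged variants were diagnostic.
ppv = positive_predictive_value(324, 332)
print(f"PPV = {ppv:.1%}")  # 97.6%, which rounds to the reported 98%
```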
Multi-locus Inherited Neoplasia Allele Syndrome (MINAS) refers to individuals with germline pathogenic variants in two or more cancer susceptibility genes (CSGs). With increased use of exome/genome sequencing it would be predicted that detection of MINAS would become more frequent. Here we review recent progress in knowledge of MINAS. A systematic literature search for reports of individuals with germline pathogenic variants in 2 or more of 94 CSGs was performed. In addition, participants with multiple primary tumours who underwent genome sequencing as part of the Rare Disease arm of the UK 100,000 Genomes Project were interrogated to detect additional cases. We identified 385 MINAS cases (211 reported in the last 5 years, 6 from 100,000 genomes participants). Most (287/385) cases contained at least one pathogenic variant in either BRCA1 or BRCA2. 108/385 MINAS cases had multiple primary tumours at presentation and a subset of cases presented unusual multiple tumour phenotypes. We conclude that, as predicted, increasing numbers of individuals with MINAS have been reported but, except for individuals with BRCA1/BRCA2 MINAS, individual CSG combinations are generally rare. In many cases it appears that the clinical phenotype is that which would be expected from the effects of the constituent CSG variants acting independently. However, in some instances the presence of unusual tumour phenotypes and/or multiple primary tumours suggests that there may be complex interactions between the relevant MINAS CSGs. Systematic reporting of MINAS cases in a MINAS database (e.g. https://databases.lovd.nl/shared/diseases/04296 ) will facilitate more accurate prognostic predictions for specific CSG combinations.
An important fraction of patients with rare disorders remains without a clear genetic diagnosis, even after whole-exome or whole-genome sequencing, which makes it difficult to provide adequate treatment and genetic counseling. The analysis of genomic data in rare disorders mostly considers the presence of single gene variants in coding regions that follow a concrete monogenic mode of inheritance. A digenic inheritance, with variants in two functionally-related genes in the same individual, is a plausible alternative that might explain the genetic basis of the disease in some cases. In this case, digenic disease combinations should be absent or underrepresented in healthy individuals. We develop a framework to evaluate the significance of digenic combinations and test its statistical power in different scenarios. We suggest that this approach will be relevant with the advent of new sequencing efforts including hundreds of thousands of samples.
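The intuition that absence of a digenic combination only becomes informative in very large cohorts can be illustrated with a deliberately simplified calculation. This sketch is not the authors' framework: it assumes that carrier status for the two genes is independent in healthy individuals and that carrier frequencies are known, and all function names are hypothetical.

```python
import math


def expected_double_carriers(n: int, freq_a: float, freq_b: float) -> float:
    """Expected number of individuals carrying variants in both genes,
    ASSUMING independence between the two genes."""
    return n * freq_a * freq_b


def prob_zero_double_carriers(n: int, freq_a: float, freq_b: float) -> float:
    """Probability of seeing no double carriers among n healthy individuals
    under the independence assumption."""
    return (1.0 - freq_a * freq_b) ** n


def min_cohort_size(freq_a: float, freq_b: float, alpha: float = 0.05) -> int:
    """Smallest cohort size at which observing zero double carriers would be
    surprising at level alpha under independence."""
    p = freq_a * freq_b
    if not 0.0 < p < 1.0:
        raise ValueError("joint carrier probability must be in (0, 1)")
    # Solve (1 - p)^n < alpha for n.
    return math.ceil(math.log(alpha) / math.log1p(-p))
```

With carrier frequencies of 1% for each gene, for example, `min_cohort_size(0.01, 0.01)` comes out at roughly thirty thousand individuals, and rarer variants push the requirement far higher, which fits the suggestion that such analyses become practical with cohorts of hundreds of thousands of samples.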
X-linked Alport syndrome is a genetic kidney disease caused by pathogenic COL4A5 variants, but little is known of the consequences of missense variants affecting the NC1 domain of the corresponding collagen IV α5 chain. This study examined these variants in a normal population database (gnomAD) and other databases (LOVD, ClinVar and the 100,000 Genomes Project) to determine their pathogenicity and clinical significance. Males with Cys substitutions in the collagen IV α5 NC1 domain reported in LOVD (n = 25) were examined for typical Alport features, including age at kidney failure. All NC1 variants in LOVD (n = 86) were then assessed for structural damage using an online computational tool, Missense3D. Variants in the ClinVar, gnomAD and 100,000 Genomes Project databases were also examined for structural effects. Predicted damage associated with NC1 substitutions was then correlated with the level of conservation of the affected residues. Cys substitutions in males were associated with the typical features of X-linked Alport syndrome, with a median age at kidney failure of 31 years. NC1 substitutions predicted to cause structural damage were overrepresented in LOVD (p < 0.001), and those affecting Cys residues or 'buried' Gly residues were more common than expected (both p < 0.001). Most NC1 substitutions in gnomAD (88%) were predicted to be structurally neutral. Substitutions affecting conserved residues resulted in more structural damage than those affecting non-conserved residues (p < 0.001). Many pathogenic missense variants affecting the collagen IV α5 NC1 domain have their effect through molecular structural damage, and 3D modelling is a useful tool in their assessment.
Autosomal recessive whole gene deletions of nephrocystin-1 (NPHP1) result in abnormal structure and function of the primary cilia. These deletions can result in a tubulointerstitial kidney disease known as nephronophthisis and retinal (Senior-Løken syndrome) and neurological (Joubert syndrome) diseases. Nephronophthisis is a common cause of end-stage kidney disease (ESKD) in children and accounts for up to 1% of adult-onset ESKD. Single nucleotide variants (SNVs) and small insertions and deletions (Indels) have been less well characterised. We used a gene pathogenicity scoring system (GenePy) and a genotype-to-phenotype approach on individuals recruited to the UK Genomics England (GEL) 100,000 Genomes Project (100kGP) (n = 78,050). This approach identified all participants with NPHP1-related diseases reported by NHS Genomics Medical Centres and an additional eight participants. Extreme NPHP1 gene scores, often underpinned by clear recessive inheritance, were observed in patients from diverse recruitment categories, including cancer, suggesting the possibility of a more widespread disease than previously appreciated. In total, ten participants had homozygous CNV deletions and eight were homozygous or compound heterozygous for SNVs. Our data also reveal strong in-silico evidence that approximately 44% of NPHP1-related disease may be due to SNVs, with AlphaFold structural modelling evidence for a significant impact on protein structure. This study suggests historical under-reporting of SNVs in NPHP1-related diseases compared with CNVs.
Cancer genome sequencing enables accurate classification of tumours and tumour subtypes. However, prediction performance is still limited using exome-only sequencing and for tumour types with low somatic mutation burden such as many paediatric tumours. Moreover, the ability to leverage deep representation learning in discovery of tumour entities remains unknown.
We introduce here Mutation-Attention (MuAt), a deep neural network to learn representations of simple and complex somatic alterations for prediction of tumour types and subtypes. In contrast to many previous methods, MuAt utilizes the attention mechanism on individual mutations instead of aggregated mutation counts.
We trained MuAt models on 2587 whole cancer genomes (24 tumour types) from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and 7352 cancer exomes (20 types) from the Cancer Genome Atlas (TCGA). MuAt achieved prediction accuracy of 89% for whole genomes and 64% for whole exomes, and a top-5 accuracy of 97% and 90%, respectively. MuAt models were found to be well-calibrated and to perform well in three independent whole cancer genome cohorts with 10,361 tumours in total. We show MuAt to be able to learn clinically and biologically relevant tumour entities including acral melanoma, SHH-activated medulloblastoma, SPOP-associated prostate cancer, microsatellite instability, POLE proofreading deficiency, and MUTYH-associated pancreatic endocrine tumours without these tumour subtypes and subgroups being provided as training labels. Finally, scrutiny of MuAt attention matrices revealed both ubiquitous and tumour-type specific patterns of simple and complex somatic mutations.
Integrated representations of somatic alterations learnt by MuAt were able to accurately identify histological tumour types and identify tumour entities, with potential to impact precision cancer medicine.