Whole-genome sequencing (WGS) is becoming widely used in clinical medicine in diagnostic contexts and to inform treatment choice. Here we evaluate the potential of the Oxford Nanopore Technologies (ONT) MinION long-read sequencer for routine WGS by sequencing the reference sample NA12878 and the genome of an individual with ataxia-pancytopenia syndrome and severe immune dysregulation. We develop and apply a novel reference panel-free analytical method to infer and then exploit phase information, which improves single-nucleotide variant (SNV) calling performance from otherwise modest levels. In the clinical sample, we identify and directly phase two non-synonymous de novo variants in SAMD9L (OMIM #159550), inferring that they lie on the same paternal haplotype. Whilst consensus SNV-calling error rates from ONT data remain substantially higher than those from short-read methods, we demonstrate the substantial benefits of analytical innovation. Ongoing improvements to base-calling and SNV-calling methodology must continue for nanopore sequencing to establish itself as a primary method for clinical WGS.
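As a toy illustration of direct phasing from long reads, the sketch below classifies two heterozygous variants as lying on the same haplotype (cis) or on opposite haplotypes (trans) by counting reads that span both sites. It is not the reference panel-free method described above; the function name `classify_phase`, the `min_fraction` threshold, and the example alleles are all assumptions made for illustration.

```python
from collections import Counter


def classify_phase(spanning_reads, alt_a, alt_b, min_fraction=0.8):
    """Classify two heterozygous variants as 'cis', 'trans', or 'ambiguous'.

    spanning_reads: list of (allele_at_site_a, allele_at_site_b) tuples,
    one tuple per read that covers both sites (hypothetical input format).
    min_fraction: assumed share of reads that must agree before a call is made.
    """
    counts = Counter()
    for allele_a, allele_b in spanning_reads:
        has_a = allele_a == alt_a
        has_b = allele_b == alt_b
        if has_a == has_b:
            # Both alternate alleles, or neither: consistent with cis.
            counts["cis"] += 1
        else:
            # Exactly one alternate allele on the read: consistent with trans.
            counts["trans"] += 1

    total = counts["cis"] + counts["trans"]
    if total == 0:
        return "ambiguous"
    for label in ("cis", "trans"):
        if counts[label] / total >= min_fraction:
            return label
    return "ambiguous"


# Made-up reads: five support cis, one (likely a sequencing error) supports trans.
example_reads = [("T", "G"), ("T", "G"), ("C", "A"), ("C", "A"), ("T", "G"), ("C", "G")]
print(classify_phase(example_reads, alt_a="T", alt_b="G"))
```

A majority threshold is used rather than requiring unanimity because individual nanopore reads carry base-calling errors; a real pipeline would also weigh base qualities and mapping qualities.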
The translation of genome sequencing into routine health care has been slow, partly because of concerns about affordability. The aspirational cost of sequencing a genome is $1000, but there is little evidence to support this estimate. We estimate the cost of using genome sequencing in routine clinical care in patients with cancer or rare diseases.
We performed a microcosting study of Illumina-based genome sequencing in a UK National Health Service laboratory processing 399 samples/year. Cost data were collected for all steps in the sequencing pathway, including bioinformatics analysis and reporting of results. Sensitivity analysis identified key cost drivers.
Genome sequencing costs £6841 per cancer case (comprising matched tumor and germline samples) and £7050 per rare disease case (three samples). The consumables used during sequencing are the most expensive component of testing (68-72% of the total cost). Equipment costs are higher for rare disease cases, whereas consumable and staff costs are slightly higher for cancer cases.
The cost of genome sequencing is underestimated if only sequencing costs are considered, and likely surpasses $1000/genome in a single laboratory. This aspirational sequencing cost will likely only be achieved if consumable costs are considerably reduced and sequencing is performed at scale.
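The reported figures can be broken down with simple arithmetic. The sketch below derives a per-sample cost from the per-case costs and sample counts given above, and the range implied by the 68-72% consumables share. The source does not say which end of that range belongs to which case type, so applying the full range to both cases is an assumption, as are the helper names.

```python
# Per-case costs (GBP) and sample counts as reported in the abstract.
CASES = {
    "cancer": {"cost_gbp": 6841, "samples": 2},        # matched tumor + germline
    "rare disease": {"cost_gbp": 7050, "samples": 3},  # three samples per case
}

# Consumables share of total cost; applied to both case types (assumption).
CONSUMABLE_SHARE = (0.68, 0.72)


def per_sample_cost(cost_gbp, samples):
    """Average cost per sequenced sample within one case."""
    return cost_gbp / samples


def consumable_range(cost_gbp, share=CONSUMABLE_SHARE):
    """Lower and upper bound of the consumables cost, rounded to pence."""
    low, high = share
    return (round(cost_gbp * low, 2), round(cost_gbp * high, 2))


for name, case in CASES.items():
    sample_cost = per_sample_cost(case["cost_gbp"], case["samples"])
    low, high = consumable_range(case["cost_gbp"])
    print(f"{name}: £{sample_cost:.2f} per sample, consumables £{low:.2f} to £{high:.2f}")
```

Per-sample averages computed this way fold reporting and analysis costs into each sample, so they are a rough guide rather than a marginal cost.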
Minichromosome maintenance protein 10 (MCM10) is essential for eukaryotic DNA replication. Here, we describe compound heterozygous MCM10 variants in patients with distinctive, but overlapping, clinical phenotypes: natural killer (NK) cell deficiency (NKD) and restrictive cardiomyopathy (RCM) with hypoplasia of the spleen and thymus. To understand the mechanism of MCM10-associated disease, we modeled these variants in human cell lines. MCM10 deficiency causes chronic replication stress that reduces cell viability due to increased genomic instability and telomere erosion. Our data suggest that loss of MCM10 function constrains telomerase activity by accumulating abnormal replication fork structures enriched with single-stranded DNA. Terminally arrested replication forks in MCM10-deficient cells require endonucleolytic processing by MUS81, as MCM10:MUS81 double mutants display decreased viability and accelerated telomere shortening. We propose that these bi-allelic variants in MCM10 predispose specific cardiac and immune cell lineages to prematurely arrest during differentiation, causing the clinical phenotypes observed in both NKD and RCM patients.
To characterize the molecular genetics of autosomal recessive Noonan syndrome.
Families underwent phenotyping for features of Noonan syndrome in children and their parents. Two multiplex families underwent linkage analysis. Exome, genome, or multigene panel sequencing was used to identify variants. The molecular consequences of observed splice variants were evaluated by reverse-transcription polymerase chain reaction.
Twelve families with a total of 23 affected children with features of Noonan syndrome were evaluated. The phenotypic range included mildly affected patients, but it was lethal in some, with cardiac disease and leukemia. All of the parents were unaffected. Linkage analysis using a recessive model supported a candidate region in chromosome 22q11, which includes LZTR1, previously shown to harbor mutations in patients with Noonan syndrome inherited in a dominant pattern. Sequencing analyses of 21 live-born patients and a stillbirth identified biallelic pathogenic variants in LZTR1, including putative loss-of-function, missense, and canonical and noncanonical splicing variants in the affected children, with heterozygous, clinically unaffected parents and heterozygous or normal genotypes in unaffected siblings.
These clinical and genetic data confirm the existence of a form of Noonan syndrome that is inherited in an autosomal recessive pattern and identify biallelic mutations in LZTR1.
Recent advances in throughput and accuracy mean that the Oxford Nanopore Technologies PromethION platform is now a viable solution for genome sequencing. Much of the validation of bioinformatic tools for this long-read data has focussed on calling germline variants (including structural variants). Somatic variants are outnumbered many-fold by germline variants and their detection is further complicated by the effects of tumour purity/subclonality. Here, we evaluate the extent to which Nanopore sequencing enables detection and analysis of somatic variation. We do this through sequencing tumour and germline genomes for a patient with diffuse B-cell lymphoma and comparing results with 150 bp short-read sequencing of the same samples. Calling germline single nucleotide variants (SNVs) from specific chromosomes of the long-read data achieved good specificity and sensitivity. However, results of somatic SNV calling highlight the need for the development of specialised joint calling algorithms. We find the comparative genome-wide performance of different tools varies significantly between structural variant types, and suggest long reads are especially advantageous for calling large somatic deletions and duplications. Finally, we highlight the utility of long reads for phasing clinically relevant variants, confirming that a somatic 1.6 Mb deletion and a p.(Arg249Met) mutation involving TP53 are oriented in trans.
Neurodevelopmental disorders with periventricular nodular heterotopia (PNH) are etiologically heterogeneous, and their genetic causes remain in many cases unknown. Here we show that missense mutations in NEDD4L mapping to the HECT domain of the encoded E3 ubiquitin ligase lead to PNH associated with toe syndactyly, cleft palate and neurodevelopmental delay. Cellular and expression data showed sensitivity of PNH-associated mutants to proteasome degradation. Moreover, an in utero electroporation approach showed that PNH-related mutants and excess wild-type NEDD4L affect neurogenesis, neuronal positioning and terminal translocation. Further investigations, including rapamycin-based experiments, found differential deregulation of pathways involved. Excess wild-type NEDD4L leads to disruption of Dab1 and mTORC1 pathways, while PNH-related mutations are associated with deregulation of mTORC1 and AKT activities. Altogether, these data provide insights into the critical role of NEDD4L in the regulation of mTOR pathways and their contributions in cortical development.
• Understanding how a genomic variant relates to pathogenicity is critical.
• Protein destabilisation, alone, is often not a plausible explanation.
• Nearby gnomAD variants and UniProt annotations are often crucial for the hypothesis.
• We have developed the Venus webapp to help formulate potential hypotheses.
• Venus incorporates different pieces of information mapped onto structure.
Exploring the functional effect of a non-synonymous coding variant at the protein level requires multiple pieces of information to be interpreted appropriately. This is particularly important when embarking on the study of a potentially pathogenic variant linked to a rare or monogenic disease. Whereas accurate protein stability predictions alone are generally informative, other effects, such as disruption of post-translational modifications or weakened ligand binding, may also contribute to the disease phenotype. Furthermore, consideration of nearby variants that are found in the healthy population may strengthen or refute a given mechanistic hypothesis. Whilst there are several bioinformatics tools available that score a genetic variant in terms of deleteriousness, there is no single tool that assembles multiple effects of a variant on the encoded protein, beyond structural stability, and presents them on the structure for inspection.
Venus is a web application which, given a protein substitution, rapidly estimates the predicted effect on protein stability of the variant, flags if the variant affects a post-translational modification site, a predicted linear motif or known annotation, and determines the effect on protein stability of variants which affect nearby residues and have been identified in healthy populations. Venus is built upon Michelanglo and the results can be exported to it, allowing them to be annotated and shared with other researchers.
Venus is freely accessible at https://venus.cmd.ox.ac.uk and its source code is openly available at https://github.com/CMD-Oxford/Michelanglo-and-Venus.
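To make the input and one of the checks described above concrete, the sketch below parses a protein substitution written in either one-letter (`G92D`) or three-letter (`p.Gly92Asp`) form and flags whether the affected residue is in a set of annotated positions, such as post-translational modification sites. This is not Venus's implementation or API; the function names, the toy sequence, and the annotated positions are hypothetical.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Accepts forms such as "p.Gly92Asp", "p.(Arg249Met)" and "G92D".
THREE_LETTER = re.compile(r"^(?:p\.)?\(?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})\)?$")
ONE_LETTER = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def parse_substitution(text):
    """Return (wild_type, position, mutant) using one-letter residue codes."""
    match = THREE_LETTER.match(text)
    if match:
        wild_type, position, mutant = match.groups()
        if wild_type not in THREE_TO_ONE or mutant not in THREE_TO_ONE:
            raise ValueError(f"unknown residue in {text!r}")
        return THREE_TO_ONE[wild_type], int(position), THREE_TO_ONE[mutant]
    match = ONE_LETTER.match(text)
    if match:
        wild_type, position, mutant = match.groups()
        return wild_type, int(position), mutant
    raise ValueError(f"unrecognised substitution: {text!r}")


def affects_annotated_site(substitution, sequence, annotated_positions):
    """Check the wild-type residue against the sequence, then report
    whether the substituted position is in annotated_positions (1-based)."""
    wild_type, position, _ = parse_substitution(substitution)
    if position < 1 or position > len(sequence) or sequence[position - 1] != wild_type:
        raise ValueError("wild-type residue does not match the sequence")
    return position in annotated_positions


# Toy sequence with a made-up annotated site at position 4.
print(affects_annotated_site("S4A", "MKTSAY", {4}))
```

Validating the wild-type residue before any further analysis catches a common source of error, namely a variant reported against a different transcript or isoform than the sequence being inspected.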
Glycosylphosphatidylinositol (GPI)-anchored proteins play important roles in many biological processes, and mutations affecting proteins involved in the synthesis of the GPI anchor are reported to cause a wide spectrum of intellectual disabilities (IDs) with characteristic additional phenotypic features. Here, we describe a total of five individuals (from three unrelated families) in whom we identified mutations in PGAP3, encoding a protein that is involved in GPI-anchor maturation. Three siblings in a consanguineous Pakistani family presented with profound developmental delay, severe ID, no speech, psychomotor delay, and postnatal microcephaly. A combination of autozygosity mapping and exome sequencing identified a 13.8 Mb region harboring a homozygous c.275G>A (p.Gly92Asp) variant in PGAP3 region 17q11.2–q21.32. Subsequent testing showed elevated serum alkaline phosphatase (ALP), a GPI-anchored enzyme, in all three affected children. In two unrelated individuals in a cohort with developmental delay, ID, and elevated ALP, we identified compound-heterozygous variants c.439dupC (p.Leu147Profs∗16) and c.914A>G (p.Asp305Gly) and homozygous variant c.314C>G (p.Pro105Arg). The 1 bp duplication causes a frameshift and nonsense-mediated decay. Further evidence supporting pathogenicity of the missense mutations c.275G>A, c.314C>G, and c.914A>G was provided by the absence of the variants from ethnically matched controls, phylogenetic conservation, and functional studies on Chinese hamster ovary cell lines. Taken together with recent data on PGAP2, these results confirm the importance of the later GPI-anchor remodelling steps for normal neuronal development. Impairment of PGAP3 causes a subtype of hyperphosphatasia with ID, a congenital disorder of glycosylation that is also referred to as Mabry syndrome.
Isolated complex I deficiency is a common biochemical phenotype observed in pediatric mitochondrial disease and often arises as a consequence of pathogenic variants affecting one of the ∼65 genes encoding the complex I structural subunits or assembly factors. Such genetic heterogeneity means that application of next-generation sequencing technologies to undiagnosed cohorts has been a catalyst for genetic diagnosis and gene-disease associations. We describe the clinical and molecular genetic investigations of four unrelated children who presented with neuroradiological findings and/or elevated lactate levels, highly suggestive of an underlying mitochondrial diagnosis. Next-generation sequencing identified bi-allelic variants in NDUFA6, encoding a 15 kDa LYR-motif-containing complex I subunit that forms part of the Q-module. Functional investigations using subjects’ fibroblast cell lines demonstrated complex I assembly defects, which were characterized in detail by mass-spectrometry-based complexome profiling. This confirmed a marked reduction in incorporated NDUFA6 and a concomitant reduction in other Q-module subunits, including NDUFAB1, NDUFA7, and NDUFA12. Lentiviral transduction of subjects’ fibroblasts showed normalization of complex I. These data also support supercomplex formation, whereby the ∼830 kDa complex I intermediate (consisting of the P- and Q-modules) is in complex with assembled complex III and IV holoenzymes despite lacking the N-module. Interestingly, RNA-sequencing data provided evidence that the consensus RefSeq accession number does not correspond to the predominant transcript in clinically relevant tissues, prompting revision of the NDUFA6 RefSeq transcript and highlighting not only the importance of thorough variant interpretation but also the assessment of appropriate transcripts for analysis.