Genome-wide association studies have been successful in identifying common variants that impact complex human traits and diseases. However, despite this success, the joint effects of these variants explain only a small proportion of the genetic variance in these phenotypes, leading to speculation that rare genetic variation might account for much of the 'missing heritability'. Consequently, there has been an exciting period of research and development into the methodology for the analysis of rare genetic variants, typically by considering their joint effects on complex traits within the same functional unit or genomic region. In this review, we describe a general framework for modelling the joint effects of rare genetic variants on complex traits in association studies of unrelated individuals. We summarise a range of widely used association tests that have been developed from this model and provide an overview of the relative performance of these approaches from published simulation studies.
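As an illustration of what "joint effects within the same functional unit" can mean in practice, the following is a minimal sketch of a burden-style analysis, one simple family of rare-variant tests. It is not the framework described in the review itself: the function names, the equal default weights, and the toy genotype and trait values are all assumptions made for the example. Each individual's rare-allele counts in a region are collapsed into a single weighted score, and the trait is then regressed on that score.

```python
from statistics import mean


def burden_scores(genotypes, weights=None):
    """Collapse per-variant minor-allele counts into one score per individual.

    genotypes: list of rows, one per individual, each a list of allele counts
               (0, 1 or 2) for the rare variants in one region.
    weights:   optional per-variant weights; equal weights are assumed here.
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def ols_slope(x, y):
    """Least-squares slope of y on x (effect of the burden score on the trait)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Toy data (hypothetical): six unrelated individuals, three rare variants.
genotypes = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
trait = [0.1, 1.2, 0.9, -0.2, 2.1, 0.0]

scores = burden_scores(genotypes)
slope = ols_slope(scores, trait)
print(scores, round(slope, 3))
```

A real analysis would also adjust for covariates and derive a test statistic and p-value for the slope; the sketch only shows the collapsing step and the effect estimate.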
Various species of the intestinal microbiota have been associated with the development of colorectal cancer, but it has not been demonstrated that bacteria have a direct role in the occurrence of oncogenic mutations. Escherichia coli can carry the pathogenicity island pks, which encodes a set of enzymes that synthesize colibactin. This compound is believed to alkylate DNA on adenine residues and induces double-strand breaks in cultured cells. Here we expose human intestinal organoids to genotoxic pks+ E. coli by repeated luminal injection over five months. Whole-genome sequencing of clonal organoids before and after this exposure revealed a distinct mutational signature that was absent from organoids injected with isogenic pks-mutant bacteria. The same mutational signature was detected in a subset of 5,876 human cancer genomes from two independent cohorts, predominantly in colorectal cancer. Our study describes a distinct mutational signature in colorectal cancer and implies that the underlying mutational process results directly from past exposure to bacteria carrying the colibactin-producing pks pathogenicity island.
The development of computational methods to assess pathogenicity of pre-messenger RNA splicing variants is critical for diagnosis of human disease. We assessed the capability of eight algorithms, and a consensus approach, to prioritize 249 variants of uncertain significance (VUSs) that underwent splicing functional analyses. The capability of algorithms to differentiate VUSs away from the immediate splice site as being 'pathogenic' or 'benign' is likely to have substantial impact on diagnostic testing. We show that SpliceAI is the best single strategy in this regard, but that combined usage of tools using a weighted approach can increase accuracy further. We incorporated prioritization strategies alongside diagnostic testing for rare disorders. We show that 15% of 2783 referred individuals carry rare variants expected to impact splicing that were not initially identified as 'pathogenic' or 'likely pathogenic'; one in five of these cases could lead to new or refined diagnoses.
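The idea of combining tools "using a weighted approach" can be sketched as a weighted mean of per-tool scores. The abstract does not give the weighting scheme used in the study, so everything below is an assumption for illustration: the function name, the placeholder tools `tool_b` and `tool_c`, the weights, and the example scores are hypothetical, and scores are assumed to have been rescaled to the range 0 to 1 beforehand.

```python
def weighted_consensus(scores, weights):
    """Weighted mean of per-tool splicing scores, each assumed to lie in [0, 1].

    scores:  dict mapping tool name -> score, or None if the tool gave no call.
    weights: dict mapping tool name -> non-negative weight (hypothetical values).
    Tools with a missing score are skipped and the remaining weights renormalised.
    """
    used = {t: s for t, s in scores.items() if s is not None and t in weights}
    total = sum(weights[t] for t in used)
    if total == 0:
        raise ValueError("no weighted tool produced a score for this variant")
    return sum(weights[t] * s for t, s in used.items()) / total


# Hypothetical weights: the best single tool gets the largest weight.
weights = {"spliceai": 0.5, "tool_b": 0.3, "tool_c": 0.2}

# Hypothetical variant: one tool returned no prediction.
variant_scores = {"spliceai": 0.8, "tool_b": 0.4, "tool_c": None}

consensus = weighted_consensus(variant_scores, weights)
print(round(consensus, 3))
```

Renormalising over the tools that returned a score keeps variants comparable when coverage differs between tools; in practice the weights would be fitted against variants with functional splicing data rather than chosen by hand.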
Many genetic testing methodologies are biased towards picking up structural variants (SVs) that alter copy number. Copy-neutral rearrangements such as inversions are therefore likely to suffer from underascertainment. In this study, manual review prompted by a virtual multidisciplinary team meeting and subsequent bioinformatic prioritisation of data from the 100K Genomes Project was performed across 43 genes linked to well-characterised skeletal disorders. Ten individuals from three independent families were found to harbour diagnostic inversions. In two families, inverted segments of 1.2/14.8 Mb unequivocally disrupted GLI3 and segregated with skeletal features consistent with Greig cephalopolysyndactyly syndrome. For one family, phenotypic blending was due to the opposing breakpoint lying ~45 kb from HOXA13. In the third family, long suspected to have Marfan syndrome, a 2.0 Mb inversion disrupting FBN1 was identified. These findings resolved lengthy diagnostic odysseys of 9–20 years and highlight the importance of direct interaction between clinicians and data-analysts. These exemplars of a rare mutational class inform future SV prioritisation strategies within the NHS Genomic Medicine Service and similar genome sequencing initiatives. In over 30 years since these two disease-gene associations were identified, large inversions have yet to be described and so our results extend the mutational spectra linked to these conditions.
Cilia are highly specialized cellular organelles that serve multiple functions in human development and health. Their central importance in the body is demonstrated by the occurrence of a diverse range of developmental disorders that arise from defects of cilia structure and function, caused by a range of different inherited mutations found in more than 150 different genes. Genetic analysis has rapidly advanced our understanding of the cell biological basis of ciliopathies over the past two decades, with more recent technological advances in genomics rapidly accelerating this progress. The 100,000 Genomes Project was launched in 2012 in the UK to improve diagnosis and future care for individuals affected by rare diseases like ciliopathies, through whole genome sequencing (WGS). In this review we discuss the potential promise and medical impact of WGS for ciliopathies and report on current progress of the 100,000 Genomes Project, reviewing the medical, technical and ethical challenges and opportunities that new, large scale initiatives such as this can offer.
Background: Genomic variant prioritisation is one of the most significant bottlenecks to mainstream genomic testing in healthcare. Tools to improve precision while ensuring high recall are critical to successful mainstream clinical genomic testing, in particular for whole genome sequencing where millions of variants must be considered for each patient. Methods: We developed EyeG2P, a publicly available database and web application using the Ensembl Variant Effect Predictor. EyeG2P is tailored for efficient variant prioritisation for individuals with inherited ophthalmic conditions. We assessed the sensitivity of EyeG2P in 1234 individuals with a broad range of eye conditions who had previously received a confirmed molecular diagnosis through routine genomic diagnostic approaches. For a prospective cohort of 83 individuals, we assessed the precision of EyeG2P in comparison with routine diagnostic approaches. For 10 additional individuals, we assessed the utility of EyeG2P for whole genome analysis. Results: EyeG2P had 99.5% sensitivity for genomic variants previously identified as clinically relevant through routine diagnostic analysis (n=1234 individuals). Prospectively, EyeG2P enabled a significant increase in precision (35% on average) in comparison with routine testing strategies (p<0.001). We demonstrate that incorporation of EyeG2P into whole genome sequencing analysis strategies can reduce the number of variants for analysis to six variants, on average, while maintaining high diagnostic yield. Conclusion: Automated filtering of genomic variants through EyeG2P can increase the efficiency of diagnostic testing for individuals with a broad range of inherited ophthalmic disorders.
Background: The 100 000 Genomes Project (100K) recruited National Health Service patients with eligible rare diseases and cancer between 2016 and 2018. PanelApp virtual gene panels were applied to whole genome sequencing data according to Human Phenotype Ontology (HPO) terms entered by recruiting clinicians to guide focused analysis. Methods: We developed a reverse phenotyping strategy to identify 100K participants with pathogenic variants in nine prioritised disease genes (BBS1, BBS10, ALMS1, OFD1, DYNC2H1, WDR34, NPHP1, TMEM67, CEP290), representative of the full phenotypic spectrum of multisystemic primary ciliopathies. We mapped genotype data ‘backwards’ onto available clinical data to assess potential matches against phenotypes. Participants with novel molecular diagnoses and key clinical features compatible with the identified disease gene were reported to recruiting clinicians. Results: We identified 62 reportable molecular diagnoses with variants in these nine ciliopathy genes. Forty-four have been reported by 100K, 5 were previously unreported and 13 are new diagnoses. We identified 11 participants with unreportable, novel molecular diagnoses, who lacked key clinical features to justify reporting to recruiting clinicians. Two participants had likely pathogenic structural variants and one a deep intronic predicted splice variant. These variants would not be prioritised for review by standard 100K diagnostic pipelines. Conclusion: Reverse phenotyping improves the rate of successful molecular diagnosis for unsolved 100K participants with primary ciliopathies. Previous analyses likely missed these diagnoses because incomplete HPO term entry led to incorrect gene panel choice, meaning that pathogenic variants were not prioritised. Better phenotyping data are therefore essential for accurate variant interpretation and improved patient benefit.