Abstract
Summary
We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci.
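To make the idea of a sequence graph for a repeat locus concrete, the following is a minimal Python sketch, not ExpansionHunter's implementation. It assumes a toy three-node graph (left flank, a repeat node with a self-loop, right flank) and encodes it as a regular expression; the function name `count_repeat_units`, the flank sequences, and the requirement that all alternative repeat units have the same length are illustrative assumptions. Allowing several alternative units is one simple way to represent an imperfect repeat such as a polyalanine tract (GCN codons).

```python
import re


def count_repeat_units(read, left_flank, units, right_flank):
    """Count repeat units between two flanks in a read.

    Hypothetical helper: models the path left flank -> (unit)* -> right flank.
    `units` lists the alternative repeat units; all are assumed equal in length.
    Returns None when the read does not span the whole repeat.
    """
    unit_length = len(units[0])
    if any(len(u) != unit_length for u in units):
        raise ValueError("this sketch assumes equal-length repeat units")

    # The self-loop on the repeat node becomes a starred group of alternatives.
    unit_pattern = "(?:" + "|".join(re.escape(u) for u in units) + ")"
    pattern = (
        re.escape(left_flank) + "((?:" + unit_pattern + ")*)" + re.escape(right_flank)
    )

    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // unit_length


# A perfect CAG repeat and an imperfect polyalanine (GCN) repeat.
cag_read = "ACGT" + "CAG" * 5 + "TTGA"
ala_read = "AATTCC" + "GCAGCCGCGGCT" + "GGATCC"
cag_count = count_repeat_units(cag_read, "ACGT", ["CAG"], "TTGA")
ala_count = count_repeat_units(ala_read, "TTCC", ["GCA", "GCC", "GCG", "GCT"], "GGAT")
```

A real genotyper also has to handle reads that only partly overlap the repeat, sequencing errors, and two alleles per locus; none of that is attempted here.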
Availability and implementation
ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/.
Supplementary information
Supplementary data are available at Bioinformatics online.
The hereditary spastic paraplegias are a heterogeneous group of degenerative disorders that are clinically classified as either pure, with predominant lower limb spasticity, or complex, where spastic paraplegia is complicated by additional neurological features, and are inherited in autosomal dominant, autosomal recessive or X-linked patterns. Genetic defects have been identified in over 40 different genes, with more than 70 loci in total. Complex recessive spastic paraplegias have in the past been frequently associated with mutations in SPG11 (spatacsin), ZFYVE26/SPG15, SPG7 (paraplegin) and a handful of other rare genes, but many cases remain genetically undefined. The overlap with other neurodegenerative disorders has been implied in a small number of reports, but not in larger disease series. This deficiency has been largely due to the lack of suitable high throughput techniques to investigate the genetic basis of disease, but the recent availability of next generation sequencing can facilitate the identification of disease-causing mutations even in extremely heterogeneous disorders. We investigated a series of 97 index cases with complex spastic paraplegia referred to a tertiary referral neurology centre in London for diagnosis or management. The mean age of onset was 16 years (range 3 to 39). The SPG11 gene was analysed first, revealing homozygous or compound heterozygous mutations in 30/97 (30.9%) of probands. This is the largest SPG11 series reported to date and makes SPG11 by far the most common cause of complex spastic paraplegia in the UK, with severe and progressive clinical features and other neurological manifestations, linked with magnetic resonance imaging defects. Given the high frequency of SPG11 mutations, we studied the autophagic response to starvation in eight affected SPG11 cases and control fibroblast cell lines, but in our restricted study we did not observe correlations between disease status and autophagic or lysosomal markers.
In the remaining cases, next generation sequencing was carried out, revealing variants in a number of other known complex spastic paraplegia genes, including five in SPG7 (5/97), four in FA2H (also known as SPG35) (4/97) and two in ZFYVE26/SPG15. Variants were identified in genes usually associated with pure spastic paraplegia and also in the Parkinson's disease-associated gene ATP13A2, neuronal ceroid lipofuscinosis gene TPP1 and the hereditary motor and sensory neuropathy DNMT1 gene, highlighting the genetic heterogeneity of spastic paraplegia. No plausible genetic cause was identified in 51% of probands, likely indicating the existence of as yet unidentified genes.
Hereditary spastic paraplegia (HSP) is a group of heterogeneous inherited degenerative disorders characterized by lower limb spasticity. Fifty percent of HSP patients remain genetically undiagnosed. The 100,000 Genomes Project (100KGP) is a large UK-wide initiative to provide genetic diagnosis to previously undiagnosed patients and families with rare conditions. Over 400 HSP families were recruited to the 100KGP. In order to obtain genetic diagnoses, gene-based burden testing was carried out for rare, predicted pathogenic variants using candidate variants from the Exomiser analysis of the genome sequencing data. A significant gene-disease association was identified for UBAP1 and HSP. Three protein truncating variants were identified in 13 patients from 7 families. All patients presented with a juvenile form of pure HSP, with a median age at onset of 10 years, showing autosomal dominant inheritance or de novo occurrence. Additional clinical features included parkinsonism and learning difficulties, but their association with UBAP1 remains to be established.
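As an illustration of what a gene-based burden test computes, the sketch below compares how many cases and controls carry a qualifying rare variant in one gene. The abstract does not state which test statistic was used; a one-sided Fisher's exact test is assumed here purely for illustration, and the function name and all counts are hypothetical rather than taken from the study.

```python
from math import comb


def burden_p_value(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for carrier enrichment among cases.

    Returns P(X >= case_carriers), where X is the number of carriers that
    fall among the cases under the hypergeometric null of no association.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 5 of 100 cases and 1 of 1000 controls carry a
# rare, predicted pathogenic variant in the gene being tested.
p_value = burden_p_value(5, 100, 1, 1000)
```

In a genome-wide analysis, the resulting p-values would also need correcting for the number of genes tested.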
Hereditary Spastic Paraplegia (HSP) is a syndrome characterised by lower limb spasticity, occurring alone or in association with other neurological manifestations, such as cognitive impairment, seizures, ataxia or neuropathy. HSP occurs worldwide, with different populations having different frequencies of causative genes. The Greek population has not yet been characterised. The purpose of this study was to describe the clinical presentation and molecular epidemiology of the largest cohort of HSP in Greece, comprising 54 patients from 40 families. We used a targeted next-generation sequencing (NGS) approach to genetically assess a proband from each family. We made a genetic diagnosis in >50% of cases and identified 11 novel variants. Variants in SPAST and KIF5A were the most common causes of autosomal dominant HSP, whereas variants in SPG11 and CYP7B1 were the most common causes of autosomal recessive HSP. We identified a novel variant in SPG11, which led to disease with later onset and may be unique to the Greek population, and we report the first nonsense mutation in KIF5A. Interestingly, the frequency of HSP mutations in the Greek population, which is relatively isolated, was very similar to that in other European populations. We confirm that NGS approaches are an efficient diagnostic tool and should be employed early in the assessment of HSP patients.
Expansions of short tandem repeats are the cause of many neurogenetic disorders including familial amyotrophic lateral sclerosis, Huntington disease, and many others. Multiple methods have been recently developed that can identify repeat expansions in whole genome or exome sequencing data. Despite the widely recognized need for visual assessment of variant calls in clinical settings, current computational tools lack the ability to produce such visualizations for repeat expansions. Expanded repeats are difficult to visualize because they correspond to large insertions relative to the reference genome and involve many misaligning and ambiguously aligning reads.
We implemented REViewer, a computational method for visualization of sequencing data in genomic regions containing long repeat expansions, and FlipBook, a companion image viewer designed for manual curation of large collections of REViewer images. To generate a read pileup, REViewer reconstructs local haplotype sequences and distributes reads to these haplotypes in a way that is most consistent with the fragment lengths and evenness of read coverage. To create appropriate training materials for onboarding new users, we performed a concordance study with 12 scientists involved in short tandem repeat research. We used the results of this study to create a user guide that describes the basic principles of using REViewer as well as a guide to the typical features of read pileups that correspond to low confidence repeat genotype calls. Additionally, we demonstrated that REViewer can be used to annotate clinically relevant repeat interruptions by comparing visual assessment results of 44 FMR1 repeat alleles with the results of triplet repeat primed PCR. For 38 of these alleles, the results of visual assessment were consistent with triplet repeat primed PCR.
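The read-distribution step can be pictured with a deliberately crude Python sketch. This is not REViewer's algorithm: it assumes each fragment comes with a precomputed fragment length for every haplotype it could originate from, picks the haplotype whose implied length is closest to an expected value, and breaks ties in favour of the haplotype with fewer fragments so far as a stand-in for coverage evenness. The function name, input layout, and lengths are hypothetical.

```python
def assign_fragments(candidates, expected_length):
    """Greedily assign each fragment to one haplotype.

    candidates: list of dicts mapping haplotype index -> implied fragment length.
    Returns (chosen haplotype per fragment, fragment count per haplotype).
    """
    counts = {}
    choices = []
    for options in candidates:
        # Rank by closeness to the expected fragment length, then by current
        # fragment count (crude evenness), then by haplotype index.
        best = min(
            options,
            key=lambda h: (abs(options[h] - expected_length), counts.get(h, 0), h),
        )
        counts[best] = counts.get(best, 0) + 1
        choices.append(best)
    return choices, counts


# Two fragments fit both haplotypes equally well; the third fits only haplotype 1.
fragments = [{0: 450, 1: 450}, {0: 450, 1: 450}, {0: 300, 1: 450}]
choices, counts = assign_fragments(fragments, expected_length=450)
```

A greedy pass like this depends on input order; a method that optimises fragment lengths and coverage jointly would not have that limitation.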
Read pileup plots generated by REViewer offer an intuitive way to visualize sequencing data in regions containing long repeat expansions. Laboratories can use REViewer and FlipBook to assess the quality of repeat genotype calls as well as to visually detect interruptions or other imperfections in the repeat sequence and the surrounding flanking regions. REViewer and FlipBook are available under open-source licenses at https://github.com/illumina/REViewer and https://github.com/broadinstitute/flipbook respectively.
AMPA-type glutamate receptors (AMPARs) are postsynaptic ionotropic receptors which mediate fast excitatory currents. AMPARs have a heterotetrameric structure, variably composed of the four subunits GluA1-4, which are encoded by the genes GRIA1-4. Increasing evidence supports the role of pathogenic variants in GRIA1-4 genes as causative for syndromic intellectual disability (ID). We report an Italian pedigree where some male individuals share ID, seizures and facial dysmorphisms. The index subject was referred for severe ID, myoclonic seizures, cerebellar signs and short stature. Whole exome sequencing identified a novel variant in GRIA3, c.2360A>G, p.(Glu787Gly). The GRIA3 gene maps to chromosome Xq25 and the c.2360A>G variant was transmitted by his healthy mother. Subsequent analysis in the family showed a segregation pattern compatible with the causative role of this variant, further supported by preliminary functional insights. We provide a detailed description of the clinical evolution of the index subject and stress the relevance of myoclonic seizures and cerebellar syndrome as cardinal features of his presentation.