The Genetics of Epilepsy Perucca, Piero; Bahlo, Melanie; Berkovic, Samuel F
Annual review of genomics and human genetics, 08/2020, Volume 21, Issue 1
Journal Article
Peer reviewed
Open access
Epilepsy encompasses a group of heterogeneous brain diseases that affect more than 50 million people worldwide. Epilepsy may have discernible structural, infectious, metabolic, and immune etiologies; however, in most people with epilepsy, no obvious cause is identifiable. Based initially on family studies and later on advances in gene sequencing technologies and computational approaches, as well as the establishment of large collaborative initiatives, we now know that genetics plays a much greater role in epilepsy than was previously appreciated. Here, we review the progress in the field of epilepsy genetics and highlight molecular discoveries in the most important epilepsy groups, including those that have been long considered to have a nongenetic cause. We discuss where the field of epilepsy genetics is moving as it enters a new era in which the genetic architecture of common epilepsies is starting to be unraveled.
Background: The commercially available 10x Genomics protocol to generate droplet-based single-cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster similar or same cells into non-overlapping groups. Many competing methods have been proposed for this task, but there is currently little guidance with regard to which method to use.
Methods: Here we use one gold standard 10x Genomics dataset, generated from the mixture of three cell lines, as well as three silver standard 10x Genomics datasets generated from peripheral blood mononuclear cells, to examine not only the accuracy but also the robustness of a dozen methods.
Results: We found that some methods, including Seurat and Cell Ranger, outperform other methods, although performance seems to be dependent on the complexity of the studied system. Furthermore, we found that solutions produced by different methods have little in common with each other.
Conclusions: In light of this, we conclude that the choice of clustering tool crucially determines interpretation of scRNA-seq data generated by 10x Genomics. Hence practitioners and consumers should remain vigilant about the outcome of 10x Genomics scRNA-seq analysis.
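The observation that different methods produce solutions with little in common can be made concrete with a pairwise agreement score. The sketch below computes the adjusted Rand index, one widely used measure of agreement between two clusterings of the same cells. The abstract does not say which measure the authors used, so treat this as an illustrative assumption; the function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Agreement between two clusterings of the same cells.

    1.0 means identical partitions (cluster names may differ);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs of cells to disagree on

    # Contingency counts: cells per (cluster in A, cluster in B) combination.
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # chance-level pair agreement
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every cell in one cluster
    return (sum_joint - expected) / (maximum - expected)


# Toy labels (hypothetical): method_x and method_y agree up to renaming,
# method_z splits the same four cells differently.
method_x = [0, 0, 1, 1]
method_y = ["b", "b", "a", "a"]
method_z = [0, 1, 0, 1]
print(adjusted_rand_index(method_x, method_y))
print(adjusted_rand_index(method_x, method_z))
```

Computing this score for every pair of clustering tools on the same dataset gives a simple matrix that shows how far their solutions diverge.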
Identification of genomic regions that are identical by descent (IBD) has proven useful for human genetic studies where analyses have led to the discovery of familial relatedness and fine-mapping of disease critical regions. Unfortunately, IBD analyses have been underutilized in analysis of other organisms, including human pathogens. This is in part due to the lack of statistical methodologies for non-diploid genomes in addition to the added complexity of multiclonal infections. As such, we have developed an IBD methodology, called isoRelate, for analysis of haploid recombining microorganisms in the presence of multiclonal infections. Using the inferred IBD status at genomic locations, we have also developed a novel statistic for identifying loci under positive selection and propose relatedness networks as a means of exploring shared haplotypes within populations. We evaluate the performance of our methodologies for detecting IBD and selection, including comparisons with existing tools, then perform an exploratory analysis of whole genome sequencing data from a global Plasmodium falciparum dataset of more than 2500 genomes. This analysis identifies Southeast Asia as having many highly related isolates, possibly as a result of both reduced transmission from intensified control efforts and population bottlenecks following the emergence of antimalarial drug resistance. Many signals of selection are also identified, most of which overlap genes that are known to be associated with drug resistance, in addition to two novel signals observed in multiple countries that have yet to be explored in detail. Additionally, we investigate relatedness networks over the selected loci and determine that one of these sweeps has spread between continents while the other has arisen independently in different countries.
IBD analysis of microorganisms using isoRelate can be used for exploring population structure, positive selection and haplotype distributions, and will be a valuable tool for monitoring disease control and elimination efforts of many diseases.
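As a rough illustration of how inferred IBD status at genomic locations can point to candidate loci, the sketch below computes the raw proportion of isolate pairs that are IBD at each queried position. This is not the isoRelate statistic, and isoRelate itself is not called here: the abstract does not describe the statistic's details, and the segment format, function name, and coordinates are all hypothetical.

```python
def ibd_proportion_per_locus(segments, positions, n_pairs):
    """Proportion of isolate pairs inferred IBD at each position.

    segments  : iterable of (pair_id, start, end) with inclusive coordinates
                on one chromosome (hypothetical input format)
    positions : genomic positions to evaluate
    n_pairs   : total number of isolate pairs that were compared
    """
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    proportions = []
    for pos in positions:
        # A set, so a pair with overlapping segments is counted once.
        pairs_ibd = {pair for pair, start, end in segments if start <= pos <= end}
        proportions.append(len(pairs_ibd) / n_pairs)
    return proportions


# Made-up segments for three pairs (A-B, A-C, B-C) on one chromosome.
toy_segments = [("A-B", 100, 500), ("A-C", 300, 900), ("A-B", 450, 600)]
print(ibd_proportion_per_locus(toy_segments, [50, 400, 550, 800], n_pairs=3))
```

Positions where this proportion is unusually high relative to the rest of the genome are the kind of signal a selection statistic would then need to normalise and test. That step is deliberately left out here.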
Short tandem repeats (STRs) are highly informative genetic markers that have been used extensively in population genetics analysis. They are an important source of genetic diversity and can also have functional impact. Despite the availability of bioinformatic methods that permit large-scale genome-wide genotyping of STRs from whole genome sequencing data, they have not previously been applied to sequencing data from large collections of malaria parasite field samples. Here, we have genotyped STRs using HipSTR in published whole-genome sequence data from more than 3,000 Plasmodium falciparum and 174 Plasmodium vivax samples collected across the globe. High levels of noise and variability in the resultant callset necessitated the development of a novel method for quality control of STR genotype calls. A set of high-quality STR loci (6,768 from P. falciparum and 3,496 from P. vivax) were used to study Plasmodium genetic diversity, population structures and genomic signatures of selection and these were compared to genome-wide single nucleotide polymorphism (SNP) genotyping data. In addition, the genome-wide information about genetic variation and other characteristics of STRs in P. falciparum and P. vivax has been made available in an interactive web-based R Shiny application, PlasmoSTR (https://github.com/bahlolab/PlasmoSTR).
Benign adult familial myoclonic epilepsy type 1 (BAFME1) in several Japanese and Chinese families has recently been found to be caused by pentanucleotide repeat expansions in SAMD12. We identified a Thai family with six members affected with BAFME. Microsatellite studies suggested a linkage to the BAFME1 region on chromosome 8q24. Subsequently, long-read whole-genome sequencing showed the (TTTTA)(TTTCA) repeat expansion in intron 4 of SAMD12 in an affected member. Repeat-primed PCR and long-range PCR revealed that the pentanucleotide repeat expansions segregated with the disease status. Our Thai family is the first non-Japanese and non-Chinese family with BAFME1. SNP array showed that the aberrant repeats had the same haplotype as those previously determined in Japanese and Chinese patients, suggesting a common ancestry. The variant is estimated to have arisen ~12,000 years ago.
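To illustrate what sizing a pentanucleotide repeat in a single long read involves, the sketch below counts the longest uninterrupted run of a motif in a sequence string. It assumes error-free sequence and exact motif matches, which real long-read data do not provide, and it is not the pipeline used in the study. The function name, flanking bases, and repeat counts are invented for the example.

```python
import re


def longest_motif_run(sequence, motif):
    """Number of repeat units in the longest uninterrupted run of `motif`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # One or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = 0
    for match in pattern.finditer(sequence):
        best = max(best, len(match.group(0)) // len(motif))
    return best


# Toy read (hypothetical): short flanks around 4 x TTTTA followed by 3 x TTTCA.
toy_read = "GACG" + "TTTTA" * 4 + "TTTCA" * 3 + "GGC"
print(longest_motif_run(toy_read, "TTTTA"))
print(longest_motif_run(toy_read, "TTTCA"))
```

Tolerating sequencing errors would require approximate matching or alignment to a repeat model, which is beyond this sketch.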
Genetic investigations of people with impaired development of spoken language provide windows into key aspects of human biology. Over 15 years after FOXP2 was identified, most speech and language impairments remain unexplained at the molecular level. We sequenced whole genomes of nineteen unrelated individuals diagnosed with childhood apraxia of speech, a rare disorder enriched for causative mutations of large effect. Where DNA was available from unaffected parents, we discovered de novo mutations, implicating genes, including CHD3, SETD1A and WDR5. In other probands, we identified novel loss-of-function variants affecting KAT6A, SETBP1, ZFHX4, TNRC6B and MKL2, regulatory genes with links to neurodevelopment. Several of the new candidates interact with each other or with known speech-related genes. Moreover, they show significant clustering within a single co-expression module of genes highly expressed during early human brain development. This study highlights gene regulatory pathways in the developing brain that may contribute to acquisition of proficient speech.
We performed genomic mapping of a family with autosomal dominant nocturnal frontal lobe epilepsy (ADNFLE) and intellectual and psychiatric problems, identifying a disease-associated region on chromosome 9q34.3. Whole-exome sequencing identified a mutation in KCNT1, encoding a sodium-gated potassium channel subunit. KCNT1 mutations were identified in two additional families and a sporadic case with severe ADNFLE and psychiatric features. These findings implicate the sodium-gated potassium channel complex in ADNFLE and, more broadly, in the pathogenesis of focal epilepsies.
Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide repeat expansion detection. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, including nine recently reported non-reference repeat expansions not discoverable via existing methods.