ABSTRACT
We have developed a tool for detecting single exon copy‐number variations (CNVs) in targeted next‐generation sequencing data: CoNVaDING (Copy Number Variation Detection In Next‐generation sequencing Gene panels). CoNVaDING includes a stringent quality control (QC) metric that excludes or flags low‐quality exons. Since this QC shows exactly which exons can be reliably analyzed and which exons are in need of an alternative analysis method, CoNVaDING is not only useful for CNV detection in a research setting, but also in clinical diagnostics. During the validation phase, CoNVaDING detected all known CNVs in high‐quality targets in 320 samples analyzed, giving 100% sensitivity and 99.998% specificity for 308,574 exons. CoNVaDING outperforms existing tools by exhibiting a higher sensitivity and specificity and by precisely identifying low‐quality samples and regions.
We have developed a tool for detecting single exon copy number variations (CNVs) in targeted next‐generation sequencing data: CoNVaDING (Copy Number Variation Detection In Next‐generation sequencing Gene panels). CoNVaDING includes a stringent quality control metric that excludes or flags low‐quality exons. Since this quality control shows exactly which exons can be reliably analysed and which exons are in need of an alternative analysis method, CoNVaDING is also useful for CNV detection in clinical diagnostics.
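As a reminder of how per-exon sensitivity and specificity figures of this kind are derived, the following is a minimal sketch. The function name and the counts in the example are hypothetical and are not taken from the CoNVaDING validation data; the abstract reports only the resulting percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: exons with a known CNV that were called
    fn: exons with a known CNV that were missed
    tn: exons without a CNV that were not called
    fp: exons without a CNV that were called
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative exon")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only (not the study's data).
sens, spec = sensitivity_specificity(tp=40, fn=0, tn=99_995, fp=5)
print(f"sensitivity={sens:.3%} specificity={spec:.3%}")
```

With so many negative exons, even a handful of false positives barely moves the specificity, which is why the number of flagged or excluded low‐quality exons is worth reporting alongside it.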
Summary
Minimal residual disease (MRD) diagnostics are implemented in most clinical protocols for patients with acute lymphoblastic leukaemia (ALL) and are mostly performed using immunoglobulin (IG) and/or T‐cell receptor (TR) gene rearrangements as molecular polymerase chain reaction targets. Unfortunately, in 5–10% of patients no IG/TR targets, or none that are sufficiently sensitive, are available, and these patients therefore cannot be stratified appropriately. In the present study, we used fusion genes and genomic deletions as alternative MRD targets in these patients, which retrospectively revealed appropriate MRD stratification in 79% of patients with no (sensitive) IG/TR target, and a different risk group stratification in more than half of the cases.
Large-scale population sequencing studies provide a complete picture of human genetic variation within the studied populations. A key challenge is to identify, among the myriad alleles, those variants that have an effect on molecular function, phenotypes, and reproductive fitness. Most non-neutral variation consists of deleterious alleles segregating at low population frequency due to incessant mutation. To date, studies characterizing selection against deleterious alleles have been based on allele frequency (testing for a relative excess of rare alleles) or ratio of polymorphism to divergence (testing for a relative increase in the number of polymorphic alleles). Here, starting from Maruyama's theoretical prediction (Maruyama T (1974), Am J Hum Genet USA 6:669-673) that a (slightly) deleterious allele is, on average, younger than a neutral allele segregating at the same frequency, we devised an approach to characterize selection based on allelic age. Unlike existing methods, it compares sets of neutral and deleterious sequence variants at the same allele frequency. When applied to human sequence data from the Genome of the Netherlands Project, our approach distinguishes low-frequency coding non-synonymous variants from synonymous and non-coding variants at the same allele frequency and discriminates between sets of variants independently predicted to be benign or damaging for protein structure and function. The results confirm the abundance of slightly deleterious coding variation in humans.
Objective:
To identify the causative gene for the neurodegenerative disorder spinocerebellar ataxia type 19 (SCA19) located on chromosomal region 1p21‐q21.
Methods:
Exome sequencing was used to identify the causal mutation in a large SCA19 family. We then screened 230 ataxia families for mutations located in the same gene (KCND3, also known as Kv4.3) using high‐resolution melting. SCA19 brain autopsy material was evaluated, and in vitro experiments using ectopic expression of wild‐type and mutant Kv4.3 were used to study protein localization, stability, and channel activity by patch‐clamping.
Results:
We detected a T352P mutation in the third extracellular loop of the voltage‐gated potassium channel KCND3 that cosegregated with the disease phenotype in our original family. We identified 2 more novel missense mutations in the channel pore (M373I) and the S6 transmembrane domain (S390N) in 2 other ataxia families. T352P cerebellar autopsy material showed severe Purkinje cell degeneration, with abnormal intracellular accumulation and reduced protein levels of Kv4.3 in their soma. Ectopic expression of all mutant proteins in HeLa cells revealed retention in the endoplasmic reticulum and enhanced protein instability, in contrast to wild‐type Kv4.3 that was localized on the plasma membrane. The regulatory β subunit Kv channel interacting protein 2 was able to rescue the membrane localization and the stability of 2 of the 3 mutant Kv4.3 complexes. However, this either did not restore the channel function of the membrane‐located mutant Kv4.3 complexes or restored it only partially.
Interpretation:
KCND3 mutations cause SCA19 by impaired protein maturation and/or reduced channel function. ANN NEUROL 2012;72:870–880
The LifeLines Cohort Study is a large population-based cohort study and biobank that was established as a resource for research on complex interactions between environmental, phenotypic and genomic factors in the development of chronic diseases and healthy ageing. Between 2006 and 2013, inhabitants of the northern part of The Netherlands and their families were invited to participate, thereby contributing to a three-generation design. Participants visited one of the LifeLines research sites for a physical examination, including lung function, ECG and cognition tests, and completed extensive questionnaires. Baseline data were collected for 167 729 participants, aged from 6 months to 93 years. Follow-up visits are scheduled every 5 years, and in between participants receive follow-up questionnaires. Linkage is being established with medical registries and environmental data. LifeLines contains information on biochemistry, medical history, psychosocial characteristics, lifestyle and more. Genomic data are available including genome-wide genetic data of 15 638 participants. Fasting blood and 24-h urine samples are processed on the day of collection and stored at −80 °C in a fully automated storage facility. The aim of LifeLines is to be a resource for the national and international scientific community. Requests for data and biomaterials can be submitted to the LifeLines Research Office (LLscience@umcg.nl).
Allele-specific expression (ASE) concerns differences in expression level between alternative alleles and is measured by RNA sequencing. Multiple studies show that ASE plays a role in hereditary diseases by modulating penetrance or phenotype severity. However, genome diagnostics is based on DNA sequencing and therefore neglects gene expression regulation such as ASE. To take advantage of ASE in the absence of RNA sequencing, it must be predicted using only DNA variation. We have constructed ASE models from BIOS (n = 3432) and GTEx (n = 369) that predict ASE using DNA features. These models are highly reproducible and comprise many different feature types, highlighting the complex regulation that underlies ASE. We applied the BIOS-trained model to population variants in three genes in which ASE plays a clinically relevant role: BRCA2, RET and NF1. This resulted in predicted ASE effects for 27 variants, of which 10 were known pathogenic variants. We demonstrated that ASE can be predicted from DNA features using machine learning. Future efforts may improve sensitivity and translate these models into a new type of genome diagnostic tool that prioritizes candidate pathogenic variants or regulators thereof for follow-up validation by RNA sequencing. All code and machine learning models used are available at GitHub and Zenodo.
Serum hepcidin concentration is regulated by iron status, inflammation, erythropoiesis and numerous other factors, but underlying processes are incompletely understood. We studied the association of common and rare single nucleotide variants (SNVs) with serum hepcidin in one Italian study and two large Dutch population-based studies. We genotyped common SNVs with genome-wide association study (GWAS) arrays and subsequently performed imputation using the 1000 Genomes reference panel. Cohort-specific GWAS were performed for log-transformed serum hepcidin, adjusted for age and gender, and results were combined in a fixed-effects meta-analysis (total N = 6,096). Six top SNVs (p < 5 × 10⁻⁶) were genotyped in 3,821 additional samples, but associations were not replicated. Furthermore, we meta-analyzed cohort-specific exome array association results of rare SNVs with serum hepcidin that were available for two of the three cohorts (total N = 3,226), but no exome-wide significant signal (p < 1.4 × 10⁻⁶) was identified. Gene-based meta-analyses revealed 19 genes that showed significant association with hepcidin. Our results suggest the absence of common SNVs and rare exonic SNVs explaining a large proportion of phenotypic variation in serum hepcidin. We recommend extension of our study once additional substantial cohorts with hepcidin measurements, GWAS and/or exome array data become available in order to increase power to identify variants that explain a smaller proportion of hepcidin variation. In addition, we encourage follow-up of the potentially interesting genes that resulted from the gene-based analysis of low-frequency and rare variants.
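The step of combining cohort-specific GWAS results in a fixed-effects meta-analysis can be sketched as follows. This assumes the common inverse-variance weighting scheme, which the abstract does not specify; the function name and the per-cohort effect sizes in the example are hypothetical and are not results from the study.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis for one variant.

    betas: per-cohort effect estimates
    ses:   per-cohort standard errors (all > 0)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]          # w_i = 1 / se_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p-value
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort estimates for a single SNV (illustration only).
beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.10, 0.10])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.3g}")
```

Cohorts with smaller standard errors, typically the larger ones, dominate the pooled estimate, so a signal driven by one small cohort tends to shrink after pooling.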
Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputation genotypes with 'true' genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05-0.5%) compared with 1000G. In Dutch samples, the mean observed squared Pearson correlation, r², increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r² improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r² increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
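The accuracy measure reported here, the squared Pearson correlation between imputed and 'true' genotypes, can be computed per variant roughly as below. This is a minimal sketch: the function name is hypothetical, genotypes are assumed to be coded as alternate-allele counts or dosages between 0 and 2, and the example values are made up rather than taken from the benchmark.

```python
def squared_pearson(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    imputed: imputed alternate-allele dosages (floats in [0, 2]), one per sample
    truth:   genotyped alternate-allele counts (0, 1 or 2), same sample order
    """
    n = len(imputed)
    if n != len(truth) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 samples")
    mean_x = sum(imputed) / n
    mean_y = sum(truth) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed, truth))
    sxx = sum((x - mean_x) ** 2 for x in imputed)
    syy = sum((y - mean_y) ** 2 for y in truth)
    if sxx == 0 or syy == 0:
        # A monomorphic variant in either vector has no defined correlation.
        raise ValueError("correlation undefined for a constant vector")
    return (sxy * sxy) / (sxx * syy)


# Made-up dosages and genotypes for one variant in five samples.
r2 = squared_pearson([0.1, 0.9, 1.8, 0.0, 0.2], [0, 1, 2, 0, 0])
print(f"r2={r2:.3f}")
```

Averaging this value over the variants in a frequency bin gives a mean of the kind quoted for each population; rare variants are the hardest case because few samples carry the minor allele.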
Despite an explosive growth of next‐generation sequencing data, genome diagnostics only provides a molecular diagnosis to a minority of patients. Software tools that prioritize genes based on patient symptoms using known gene‐disease associations may complement variant filtering and interpretation to increase chances of success. However, many of these tools cannot be used in practice because they are embedded within variant prioritization algorithms, or exist as remote services that cannot be relied upon or are unacceptable because of legal/ethical barriers. In addition, many tools are not designed for command‐line usage, or are closed‐source, abandoned, or unavailable. We present Variant Interpretation using Biomedical literature Evidence (VIBE), a tool to prioritize disease genes based on Human Phenotype Ontology codes. VIBE is a locally installed executable that ensures operational availability and is built upon DisGeNET‐RDF, a comprehensive knowledge platform containing gene‐disease associations mostly from literature and variant‐disease associations mostly from curated source databases. VIBE's command‐line interface and output are designed for easy incorporation into bioinformatic pipelines that annotate and prioritize variants for further clinical interpretation. We evaluate VIBE in a benchmark based on 305 patient cases alongside seven other tools. Our results demonstrate that VIBE offers consistent performance with few cases missed, but we also find high complementarity among all tested tools. VIBE is a powerful, free, open source and locally installable solution for prioritizing genes based on patient symptoms. Project source code, documentation, benchmark and executables are available at https://github.com/molgenis/vibe.
Gene prioritization tool output and causal gene rank for all patient cases. Each dot represents a patient case (i.e., a set of Human Phenotype Ontology codes) for which the causal gene was prioritized by one of eight benchmarked tools. Shown are the absolute ranks of the causal genes vs the total number of candidate genes returned by a tool. The colored labels indicate which dot belongs to which tool and show, for each tool, the number of missed genes, i.e., cases where the causal gene was not present in the output gene list.