Likely pathogenic/pathogenic variants in genes encoding desmosomal proteins play an important role in the pathophysiology of arrhythmogenic right ventricular cardiomyopathy (ARVC). However, for a substantial proportion of ARVC patients, the genetic substrate remains unknown. We hypothesized that plectin, a cytolinker protein encoded by the PLEC gene, could play a role in ARVC because it has been proposed to link the desmosomal protein desmoplakin to the cytoskeleton and therefore has a potential function in the desmosomal structure.
We screened PLEC in 359 ARVC patients and compared the frequency of rare coding PLEC variants (minor allele frequency [MAF] <0.001) between patients and controls. To assess the frequency of rare variants in the control population, we evaluated the rare coding variants (MAF <0.001) found in the European cohort of the Exome Aggregation Database. We further evaluated plectin localization by immunofluorescence in a subset of patients with and without a PLEC variant.
Forty ARVC patients carried one or more rare PLEC variants (11%, 40/359). However, rare variants also seem to occur frequently in the control population (18%, 4754/26197 individuals). Nor did we find a difference in the prevalence of rare PLEC variants in ARVC patients with or without a desmosomal likely pathogenic/pathogenic variant (14% versus 8%, respectively). However, immunofluorescence analysis did show decreased plectin junctional localization in myocardial tissue from 5 ARVC patients with PLEC variants.
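The carrier frequencies reported above can be recomputed directly from the counts. The sketch below does that and adds a pooled two-proportion z statistic as one simple way to see the direction of the difference. The abstract does not state which statistical test the authors used, so the z statistic and the function names are illustrative assumptions, not the study's method.

```python
import math


def carrier_frequency(carriers, total):
    """Fraction of individuals carrying at least one rare variant."""
    return carriers / total


def two_proportion_z(carriers_a, total_a, carriers_b, total_b):
    """Pooled two-proportion z statistic (assumed test, not the authors' stated method)."""
    p_a = carriers_a / total_a
    p_b = carriers_b / total_b
    pooled = (carriers_a + carriers_b) / (total_a + total_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_a - p_b) / standard_error


# Counts taken from the text: 40/359 patients, 4754/26197 controls.
patient_frequency = carrier_frequency(40, 359)
control_frequency = carrier_frequency(4754, 26197)
z_statistic = two_proportion_z(40, 359, 4754, 26197)

print(f"patients: {patient_frequency:.1%}, controls: {control_frequency:.1%}")
print(f"z = {z_statistic:.2f}")
```

A negative z statistic means the rare-variant frequency is lower in patients than in controls, which is the opposite of the enrichment a causal gene would be expected to show.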
Although PLEC has been hypothesized as a promising candidate gene for ARVC, our current study did not show an enrichment of rare PLEC variants in ARVC patients compared to controls and therefore does not support a major role for PLEC in this disorder. Although rare PLEC variants were associated with abnormal localization in cardiac tissue, the confluence of data does not support a role for plectin abnormalities in ARVC development.
Ontologies have become an essential asset in the bioinformatics toolbox and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately and to map their contents to research data, and much of this effort is being replicated across different research groups.
OntoCAT provides a seamless programming interface to query heterogeneous ontology resources, including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy-to-learn Java, Bioconductor/R and REST web service commands, enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application.
OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO and Gen2Phen phenotype use cases.
http://www.ontocat.org.
Genetic markers and maps are instrumental in quantitative trait locus (QTL) mapping in segregating populations. The resolution of QTL localization depends on the number of informative recombinations in the population and how well they are tagged by markers. Larger populations and denser marker maps are better for detecting and locating QTLs. Marker maps that are initially too sparse can be saturated or derived de novo from high-throughput omics data (e.g. gene expression, protein or metabolite abundance). If these molecular phenotypes are affected by genetic variation due to a major QTL, they will show a clear multimodal distribution. Using this information, phenotypes can be converted into genetic markers.
The Pheno2Geno tool uses mixture modeling to select phenotypes and transform them into genetic markers suitable for construction and/or saturation of a genetic map. Pheno2Geno excludes candidate genetic markers that show evidence for multiple possibly epistatically interacting QTL and/or interaction with the environment, in order to provide a set of robust markers for follow-up QTL mapping. We demonstrate the use of Pheno2Geno on gene expression data of 370,000 probes in 148 A. thaliana recombinant inbred lines. Pheno2Geno is able to saturate the existing genetic map, decreasing the average distance between markers from 7.1 cM to 0.89 cM, close to the theoretical limit of 0.68 cM (with 148 individuals we expect a recombination every 100/148=0.68 cM); this pinpointed almost all of the informative recombinations in the population.
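The idea of turning a bimodal phenotype into a marker can be sketched as follows. Pheno2Geno itself is an R package, and its selection and filtering steps are not reproduced here. This is a minimal Python illustration that assumes a two-component Gaussian mixture fitted by EM, a hypothetical posterior cut-off of 0.95, and simulated expression values for 148 lines. All function names and parameters are illustrative.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_component_mixture(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (means, posteriors); posteriors[i][k] is the probability that
    values[i] belongs to component k.
    """
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the data.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(values) / n
    start_sd = math.sqrt(sum((v - grand_mean) ** 2 for v in values) / n)
    sds = [start_sd, start_sd]
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(iterations):
        # E-step: posterior component membership for every value.
        posteriors = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], sds[k]) for k in range(2)]
            total = sum(dens)
            posteriors.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            mass = sum(p[k] for p in posteriors)
            weights[k] = mass / n
            means[k] = sum(p[k] * v for p, v in zip(posteriors, values)) / mass
            variance = sum(p[k] * (v - means[k]) ** 2
                           for p, v in zip(posteriors, values)) / mass
            sds[k] = max(math.sqrt(variance), 1e-6)
    return means, posteriors


def call_markers(values, min_posterior=0.95):
    """Call genotype 'A' (low mode), 'B' (high mode) or None (ambiguous)."""
    means, posteriors = fit_two_component_mixture(values)
    low = 0 if means[0] <= means[1] else 1
    calls = []
    for p in posteriors:
        if p[low] >= min_posterior:
            calls.append("A")
        elif p[1 - low] >= min_posterior:
            calls.append("B")
        else:
            calls.append(None)
    return calls


# Simulated bimodal phenotype for 148 lines (74 per parental allele).
random.seed(1)
expression = ([random.gauss(5.0, 0.3) for _ in range(74)]
              + [random.gauss(8.0, 0.3) for _ in range(74)])
genotypes = call_markers(expression)
print(genotypes.count("A"), genotypes.count("B"), genotypes.count(None))
```

Individuals whose posterior probability falls below the cut-off are left uncalled rather than forced into a genotype class, which keeps ambiguous values from introducing spurious recombinations into the map.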
The Pheno2Geno package makes use of genome-wide molecular profiling and provides a tool for high-throughput de novo map construction and saturation of existing genetic maps. Processing of the showcase dataset takes less than 30 minutes on an average desktop PC. Pheno2Geno improves QTL mapping results at no additional laboratory cost and with minimal computational effort. Its results are formatted for direct use in R/qtl, the leading R package for QTL studies. Pheno2Geno is freely available on CRAN under "GNU GPL v3". The Pheno2Geno package as well as the tutorial can also be found at: http://pheno2geno.nl.
While the size and number of biobanks, patient registries and other data collections are increasing, biomedical researchers still often need to pool data for statistical power, a task that requires time-intensive retrospective integration.
To address this challenge, we developed MOLGENIS/connect, a semi-automatic system to find, match and pool data from different sources. The system shortlists relevant source attributes from thousands of candidates using ontology-based query expansion to overcome variations in terminology. It then generates algorithms that transform source attributes to a common target DataSchema. These include unit conversion, categorical value matching and complex conversion patterns (e.g. calculation of BMI). In comparison to human experts, MOLGENIS/connect was able to auto-generate 27% of the algorithms perfectly, with an additional 46% needing only minor editing, representing a reduction in the human effort and expertise needed to pool data.
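The three kinds of transformation named above (unit conversion, categorical value matching and a derived value such as BMI) can be illustrated with a small sketch. This is not the code MOLGENIS/connect generates. The attribute names, unit tables and category codes below are hypothetical and only show what such a source-to-target algorithm might look like.

```python
# Hypothetical unit tables; real sources would declare their own units.
WEIGHT_TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}
HEIGHT_TO_M = {"m": 1.0, "cm": 0.01}

# Hypothetical mapping from source category codes to target categories.
SEX_CODES = {"1": "male", "m": "male", "2": "female", "f": "female"}


def to_kilograms(value, unit):
    """Unit conversion: express a weight in kilograms."""
    return value * WEIGHT_TO_KG[unit]


def to_metres(value, unit):
    """Unit conversion: express a height in metres."""
    return value * HEIGHT_TO_M[unit]


def match_sex(source_value):
    """Categorical value matching; unknown codes map to None."""
    return SEX_CODES.get(str(source_value).strip().lower())


def body_mass_index(weight, weight_unit, height, height_unit):
    """Complex conversion: BMI = weight (kg) / height (m) squared."""
    return to_kilograms(weight, weight_unit) / to_metres(height, height_unit) ** 2


# A source record with hypothetical attribute names and units.
source_record = {"gewicht_kg": 70, "lengte_cm": 175, "geslacht": "1"}
target_record = {
    "bmi": body_mass_index(source_record["gewicht_kg"], "kg",
                           source_record["lengte_cm"], "cm"),
    "sex": match_sex(source_record["geslacht"]),
}
print(target_record)
```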
Source code, binaries and documentation are available as open-source under LGPLv3 from http://github.com/molgenis/molgenis and www.molgenis.org/connect
m.a.swertz@rug.nl
Supplementary data are available at Bioinformatics online.
Genetic disorders are a substantial cause of infant morbidity and mortality and are frequently suspected in neonatal intensive care units (NICUs). Non-specific clinical presentation or limitations to physical examination can result in the use of a plethora of genetic testing techniques, without clear strategies for test ordering. Here, we review our 2-year experience with rapid genetic testing of NICU patients in order to provide such recommendations.
We retrospectively included all patients admitted to the NICU who received clinical genetic consultation and genetic testing in our University hospital. We documented reasons for referral for genetic consultation, presenting phenotypes, differential diagnoses, genetic testing requested and their outcomes, as well as the consequences of each (rapid) genetic diagnostic approach. We calculated diagnostic yield and turnaround times (TATs).
Of the 171 included infants who received genetic consultation, 140 underwent genetic testing. As a result of testing as first tier, 13/14 patients received a genetic diagnosis from QF-PCR; 14/115 from SNP-array; 12/89 from NGS testing, of whom 4/46 were diagnosed with a small gene panel and 8/43 with a large OMIM-morbid based gene panel. Subsequent secondary or tertiary analysis and/or additional testing resulted in five more diagnoses. TATs ranged from 1 day (QF-PCR) to a median of 14 days for NGS and SNP-array testing, with TAT increasing in particular when many consecutive tests were performed. Incidental findings were detected in 5/140 tested patients (3.6%).
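The first-tier diagnostic yields implied by these counts can be worked out as diagnosed patients divided by tested patients. The sketch below uses only the counts given in the text. The dictionary keys and the helper name are illustrative.

```python
# (diagnosed, tested) per first-tier test, as reported in the text.
first_tier_counts = {
    "QF-PCR": (13, 14),
    "SNP-array": (14, 115),
    "NGS (all panels)": (12, 89),
    "NGS small gene panel": (4, 46),
    "NGS large OMIM-morbid panel": (8, 43),
}


def diagnostic_yield(diagnosed, tested):
    """Diagnostic yield as a percentage of tested patients."""
    return 100.0 * diagnosed / tested


yields = {test: diagnostic_yield(*counts)
          for test, counts in first_tier_counts.items()}
incidental_rate = diagnostic_yield(5, 140)

for test, value in yields.items():
    print(f"{test}: {value:.1f}%")
print(f"incidental findings: {incidental_rate:.1f}%")
```

On these counts the large OMIM-morbid panel has roughly twice the yield of the small gene panel (about 18.6% versus 8.7%).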
We recommend implementing a broad NGS gene panel in combination with CNV calling as the first tier of genetic testing for NICU patients given the often unspecific phenotypes of ill infants and the high yield of this large panel.
xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example, gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily onto multi-core computers, clusters and the cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types become available, xQTL workbench can be quickly customized using the Molgenis software generator.
xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org.
m.a.swertz@rug.nl.
RNA-sequencing (RNA-seq) is a powerful technique for the identification of genetic variants that affect gene-expression levels, either through expression quantitative trait locus (eQTL) mapping or through allele-specific expression (ASE) analysis. Given increasing numbers of RNA-seq samples in the public domain, we here studied to what extent eQTLs and ASE effects can be identified when using public RNA-seq data while deriving the genotypes from the RNA-sequencing reads themselves.
We downloaded the raw reads for all available human RNA-seq datasets. Using these reads we performed gene expression quantification. All samples were jointly normalized and subjected to a strict quality control. We also derived genotypes using the RNA-seq reads and used imputation to infer non-coding variants. This allowed us to perform eQTL mapping and ASE analyses jointly on all samples that passed quality control. Our results were validated using samples for which DNA-seq genotypes were available.
In total, 4,978 public human RNA-seq runs, representing many different tissues and cell types, passed quality control. Even though these data originated from many different laboratories, samples reflecting the same cell type clustered together, suggesting that technical biases due to different sequencing protocols are limited. In a joint analysis of the 1,262 samples with high-quality genotypes, we identified cis-eQTL effects for 8,034 unique genes (at a false discovery rate ≤0.05). eQTL mapping on individual tissues revealed that a limited number of samples already suffices to identify tissue-specific eQTLs for known disease-associated genetic variants. Additionally, we observed strong ASE effects for 34 rare pathogenic variants, corroborating previously observed effects on the corresponding protein levels.
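An ASE effect at a heterozygous site shows up as an imbalance between reads carrying the reference allele and reads carrying the alternative allele. The abstract does not say which test was used, so the sketch below assumes a plain exact two-sided binomial test against an expected allelic ratio of 0.5, a common simple choice. The function name and the read counts are hypothetical.

```python
from math import comb


def ase_binomial_p(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Exact two-sided binomial p-value for allelic imbalance.

    Sums the probabilities of all outcomes that are no more likely than the
    observed split of reads under the expected reference-allele fraction.
    """
    n = ref_reads + alt_reads
    p = expected_ref_fraction

    def probability(k):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    observed = probability(ref_reads)
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(probability(k) for k in range(n + 1)
                if probability(k) <= observed * (1 + 1e-9))
    return min(total, 1.0)


# Hypothetical read counts at two heterozygous sites.
balanced_site = ase_binomial_p(10, 10)   # equal allele counts
imbalanced_site = ase_binomial_p(20, 0)  # only the reference allele is seen
print(balanced_site, imbalanced_site)
```

A real analysis would also need to account for reference mapping bias, overdispersion and multiple testing, none of which this sketch handles.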
By deriving and imputing genotypes from RNA-seq data, it is possible to identify both eQTLs and ASE effects. Given the exponential growth of the number of publicly available RNA-seq samples, we expect this approach will become especially relevant for studying the effects of tissue-specific and rare pathogenic genetic variants to aid clinical interpretation of exome and genome sequencing.
Alternative splicing is considered a major mechanism for creating multicellular diversity from a limited repertoire of genes. Here, we performed the first study of genetic variation controlling alternative splicing patterns by comprehensively identifying quantitative trait loci affecting the differential expression of transcript isoforms in a large recombinant inbred population of Caenorhabditis elegans, using a new generation of whole-genome very-high-density oligonucleotide microarrays. Using 60 experimental lines, we were able to detect 435 genes with substantial heritable variation, of which 36% were regulated at a distance (in trans). Nonetheless, we find only a very small number of examples of heritable variation in alternative splicing (22 transcripts), and most of these genes colocalize with the associated genomic loci. Our findings suggest that the regulatory mechanism of alternative splicing in C. elegans is robust toward genetic variation at the genome-wide scale, which is in striking contrast to earlier observations in humans.
Objective: This paper reports on the development of a dynamic data management planning questionnaire to guide data stewards of the European Reference Network (ERN) rare disease patient registries to make their data findable, accessible, interoperable, and reusable (FAIR). As part of this work, the questionnaire was validated through expert review and aligned with existing resources on rare diseases and FAIR data management. Materials and Methods: The questionnaire was developed for the Data Stewardship Wizard, a tool for data management planning. Knowledge sources on FAIR data, ERN patient registries, and data management were used to compose questions. Ten domain experts validated the questionnaire. The topics in the questionnaire were aligned with existing knowledge bases. Results: A total of 57 questions were included in the questionnaire. Twenty-three references to the FAIR Cookbook and Research Data Management toolkit for Life Sciences were added. Expert validation provided a total of 166 comments on content, structure, and software-related issues. A public instance of the Data Stewardship Wizard was deployed for use by data stewards of ERN patient registries. Discussion: The questionnaire addresses issues that ERNs encounter when making their registries FAIR and follows the implementation choices made by the European rare disease community. A challenging task for future research is to extend the questionnaire to other types of registries and to validate with users. Conclusion: This smart questionnaire is the first model created for the Data Stewardship Wizard that helps ERN patient registries with making their data FAIR. It will assist data stewards in aligning their efforts and providing guidance on FAIR data.
Here, we present WormQTL (http://www.wormqtl.org), an easily accessible database enabling search, comparative analysis and meta-analysis of all data on variation in Caenorhabditis spp. Over the past decade, Caenorhabditis elegans has become instrumental for molecular quantitative genetics and the systems biology of natural variation. These efforts have resulted in a valuable amount of phenotypic, high-throughput molecular and genotypic data across different developmental worm stages and environments in hundreds of C. elegans strains. WormQTL provides a workbench of analysis tools for genotype-phenotype linkage and association mapping based on but not limited to R/qtl (http://www.rqtl.org). All data can be uploaded and downloaded using simple delimited text or Excel formats and are accessible via a public web user interface for biologists and R statistic and web service interfaces for bioinformaticians, based on open source MOLGENIS and xQTL workbench software. WormQTL welcomes data submissions from other worm researchers.