Exome sequencing is now mainstream in clinical practice. However, identification of pathogenic Mendelian variants remains time-consuming, in part because the limited accuracy of current computational prediction methods requires manual classification by experts. Here we introduce CAPICE, a new machine-learning-based method for prioritizing pathogenic variants, including SNVs and short InDels. CAPICE outperforms the best general (CADD, GAVIN) and consequence-type-specific (REVEL, ClinPred) computational prediction methods, for both rare and ultra-rare variants. CAPICE is easily added to diagnostic pipelines as a pre-computed score file or command-line software, or through the online MOLGENIS web service with API. Download CAPICE for free and open-source (LGPLv3) at
Keywords: Variant pathogenicity prediction, Machine learning, Exome sequencing, Molecular consequence, Allele frequency, Clinical genetics, Genome diagnostics
In this study, we compare next-generation sequencing (NGS) approaches (targeted panel (tNGS), whole exome sequencing (WES), and whole genome sequencing (WGS)) for application in newborn screening (NBS). DNA was extracted from dried blood spots (DBS) from 50 patients with genetically confirmed inherited metabolic disorders (IMDs) and 50 control samples. One hundred IMD-related genes were analyzed. Two data-filtering strategies were applied: one to detect only (likely) pathogenic ((L)P) variants, and one to detect (L)P variants in combination with variants of unknown significance (VUS). The variants were filtered and interpreted, defining true/false positives (TP/FP) and true/false negatives (TN/FN). The variant filtering strategies were assessed in a background cohort (BC) of 4833 individuals. Reliable results were obtained within 5 days. TP results (47 patient samples) for tNGS, WES, and WGS were 33, 31, and 30, respectively, using the (L)P filtering, and 40, 40, and 38, respectively, when including VUS. FN results were 11, 13, and 14, respectively, excluding VUS, and 4, 4, and 6, respectively, when including VUS. The remaining FN were mainly samples with a homozygous VUS. All controls were TN. Three BC individuals showed a homozygous (L)P variant, all related to a variable, mild phenotype. The use of NGS-based workflows in NBS seems promising, although more knowledge of data handling, automated variant interpretation, and costs is needed before implementation.
Abstract
Motivation
The volume and complexity of biological data are increasing rapidly. Many clinical professionals and biomedical researchers without a bioinformatics background are generating big '-omics' data, but do not always have the tools to manage, process or publicly share these data.
Results
Here we present MOLGENIS Research, an open-source web-application to collect, manage, analyze, visualize and share large and complex biomedical datasets, without the need for advanced bioinformatics skills.
Availability and implementation
MOLGENIS Research is freely available (open source software). It can be installed from source code (see http://github.com/molgenis), downloaded as a precompiled WAR file (for your own server), set up inside a Docker container (see http://molgenis.github.io), or requested as a Software-as-a-Service subscription. For a public demo instance and complete installation instructions see http://molgenis.org/research.
ABSTRACT
Arrhythmogenic cardiomyopathy (ACM) is an inherited cardiac disease characterized by myocardial atrophy, fibro‐fatty replacement, and a high risk of ventricular arrhythmias that lead to sudden death. In 2009, genetic data from 57 publications were collected in the arrhythmogenic right ventricular dysplasia/cardiomyopathy (ARVD/C) Genetic Variants Database (freeware available at http://www.arvcdatabase.info), which comprised 481 variants in eight ACM‐associated genes. In recent years, deep genetic sequencing has increased our knowledge of the genetics of ACM, revealing a large spectrum of nucleotide variations for which pathogenicity needs to be assessed. As of April 20, 2014, we have updated the ARVD/C database to contain more than 1,400 variants in 12 ACM‐related genes (PKP2, DSP, DSC2, DSG2, JUP, TGFB3, TMEM43, LMNA, DES, TTN, PLN, CTNNA3) as reported in more than 160 references. Of these, only 411 nucleotide variants have been reported as pathogenic, whereas the significance of the other approximately 1,000 variants is still unknown. This comprehensive collection of ACM genetic data represents a valuable source of information on the spectrum of ACM‐associated genes and aims to facilitate the interpretation of genetic data and genetic counseling.
The updated www.arvcdatabase.info is a valuable and freely accessible source of information for doctors, researchers, and patients, containing clinical and genetic data on all genes associated with arrhythmogenic cardiomyopathy. The 2014 update contains more than 1,400 variants in 12 genes (PKP2, DSP, DSC2, DSG2, JUP, TGFB3, TMEM43, LMNA, DES, TTN, PLN, CTNNA3), reported in more than 160 references. Of these, 411 nucleotide variants have been reported as pathogenic.
Each year diagnostic laboratories in the Netherlands profile thousands of individuals for heritable disease using next‐generation sequencing (NGS). This requires pathogenicity classification of millions of DNA variants on the standard 5‐tier scale. To reduce time spent on data interpretation and increase data quality and reliability, the nine Dutch labs decided to publicly share their classifications. Variant classifications of nearly 100,000 unique variants were catalogued and compared in a centralized MOLGENIS database. Variants classified by more than one center were labeled as “consensus” when classifications agreed, and shared internationally with LOVD and ClinVar. When classifications were opposed (LB/B vs. LP/P), they were labeled “conflicting”, while other nonconsensus observations were labeled “no consensus”. We assessed our classifications using the InterVar software to compare to ACMG 2015 guidelines, showing 99.7% overall consistency with only 0.3% discrepancies. Differences in classifications between Dutch labs or between Dutch labs and ACMG were mainly present in genes with low penetrance or for late onset disorders and highlight limitations of the current 5‐tier classification system. The data sharing boosted the quality of DNA diagnostics in Dutch labs, an initiative we hope will be followed internationally. Recently, a positive match with a case from outside our consortium resulted in a more definite disease diagnosis.
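The labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the consortium's implementation: it assumes that "agreed" means every lab reported the identical tier, and the function name `label_variant` and the `"single lab"` return value are hypothetical.

```python
# Tiers of the standard 5-tier scale, grouped by direction.
BENIGN = {"B", "LB"}
PATHOGENIC = {"LP", "P"}


def label_variant(classifications):
    """Label one variant from the tiers reported by different labs.

    Assumption: 'consensus' requires identical tiers from all labs.
    A real pipeline might instead treat LB/B (or LP/P) as agreeing.
    """
    if len(classifications) < 2:
        # Only variants classified by more than one center are compared.
        return "single lab"
    tiers = set(classifications)
    if len(tiers) == 1:
        return "consensus"
    if tiers & BENIGN and tiers & PATHOGENIC:
        # Opposite directions: LB/B versus LP/P.
        return "conflicting"
    return "no consensus"


print(label_variant(["LP", "LP", "LP"]))  # consensus
print(label_variant(["LB", "P"]))         # conflicting
print(label_variant(["VUS", "LP"]))       # no consensus
```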
Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions.
To overcome this, we developed a new matching algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service for biobanks and researchers. Biobankers upload their data elements and researchers their desired study variables; BiobankUniverse then automatically shortlists matching attributes between them. Users can quickly explore matching potential and search for biobanks/data elements matching their research. They can also curate matches and define personalized data-universes.
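To illustrate how lexical comparison and query expansion can work together, the following Python sketch scores a research variable against a biobank data element. It is a toy model under stated assumptions, not the BiobankUniverse algorithm: the `SYNONYMS` table is a hypothetical stand-in for ontology lookups, and token-set Jaccard similarity is one possible lexical measure among many.

```python
# Hypothetical synonym table standing in for an ontology such as UMLS.
SYNONYMS = {
    "hypertension": {"high blood pressure"},
    "myocardial infarction": {"heart attack"},
}


def tokens(text):
    """Lower-case a label and split it into a set of word tokens."""
    return set(text.lower().replace("_", " ").split())


def expand(term):
    """Return the term plus every synonym in its concept group."""
    term = term.lower()
    variants = {term}
    for concept, synonyms in SYNONYMS.items():
        group = {concept} | synonyms
        if term in group:
            variants |= group
    return variants


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(variable, data_element):
    """Best lexical score over all expanded forms of the variable."""
    target = tokens(data_element)
    return max(jaccard(tokens(v), target) for v in expand(variable))


print(match_score("hypertension", "High blood pressure"))
print(match_score("hypertension", "smoking status"))
```

Without the expansion step, "hypertension" and "High blood pressure" share no tokens and would score zero, which is the terminology gap that ontology-based expansion is meant to close.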
BiobankUniverse is available at http://biobankuniverse.com or can be downloaded as part of the open source MOLGENIS suite at http://github.com/molgenis/molgenis.
m.a.swertz@rug.nl.
Supplementary data are available at Bioinformatics online.
While the size and number of biobanks, patient registries and other data collections are increasing, biomedical researchers still often need to pool data for statistical power, a task that requires time-intensive retrospective integration.
To address this challenge, we developed MOLGENIS/connect, a semi-automatic system to find, match and pool data from different sources. The system shortlists relevant source attributes from thousands of candidates using ontology-based query expansion to overcome variations in terminology. Then it generates algorithms that transform source attributes to a common target DataSchema. These include unit conversion, categorical value matching and complex conversion patterns (e.g. calculation of BMI). In comparison to human experts, MOLGENIS/connect was able to auto-generate 27% of the algorithms perfectly, with an additional 46% needing only minor editing, representing a reduction in the human effort and expertise needed to pool data.
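The three kinds of transformation named above (unit conversion, categorical value matching, and a derived value such as BMI) can be sketched as follows. This is an illustration only, not code generated by MOLGENIS/connect: the source field names (`sex`, `weight_kg`, `height_cm`), the category codes in `SEX_MAP`, and the function names are all hypothetical.

```python
# Hypothetical mapping from source category codes to target categories.
SEX_MAP = {"M": "male", "F": "female", "1": "male", "2": "female"}


def cm_to_m(height_cm):
    """Unit conversion: centimetres to metres."""
    return height_cm / 100.0


def bmi(weight_kg, height_cm):
    """Derived value: BMI = weight (kg) / height (m) squared."""
    height_m = cm_to_m(height_cm)
    return weight_kg / (height_m * height_m)


def harmonize(record):
    """Transform one source record into the (assumed) target schema."""
    code = str(record["sex"]).strip().upper()
    return {
        # Categorical value matching; unknown codes map to None.
        "sex": SEX_MAP.get(code),
        "bmi": round(bmi(record["weight_kg"], record["height_cm"]), 1),
    }


print(harmonize({"sex": "f", "weight_kg": 70, "height_cm": 175}))
```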
Source code, binaries and documentation are available as open-source under LGPLv3 from http://github.com/molgenis/molgenis and www.molgenis.org/connect
m.a.swertz@rug.nl
Supplementary data are available at Bioinformatics online.
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon-delimited format) to a target coding system (uploaded as an Excel spreadsheet or in OWL (Web Ontology Language) or OBO (Open Biomedical Ontologies) format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects.
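As a rough illustration of n-gram based matching, the Python sketch below ranks candidate codes for a free-text value by the Dice coefficient over character bigrams. It is not SORTA's implementation and it omits the Lucene step entirely; the code identifiers (`X:001` and so on) are placeholders rather than real ontology IDs, and the function names are hypothetical.

```python
def ngrams(text, n=2):
    """Set of character n-grams of a padded, lower-cased string."""
    padded = "^" + text.lower().strip() + "$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b, n=2):
    """Dice coefficient between the n-gram sets of two strings."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def shortlist(value, codes, top=3):
    """Return up to `top` (score, code, label) tuples, best first."""
    scored = [(dice(value, label), code, label)
              for code, label in codes.items()]
    return sorted(scored, reverse=True)[:top]


# Placeholder coding system; identifiers are illustrative only.
codes = {
    "X:001": "Hearing impairment",
    "X:002": "Visual impairment",
    "X:003": "Seizure",
}

# A misspelled free-text value still ranks the intended label first.
for score, code, label in shortlist("hearing impairement", codes):
    print(f"{score:.2f}  {code}  {label}")
```

Character n-grams tolerate spelling variation better than whole-word comparison, which is why a shortlist ranked this way can still be useful for a human expert to confirm or correct.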
Database URL: http://molgenis.org/sorta or as an open source download from http://www.molgenis.org/wiki/SORTA.
Evidence is accumulating that, similar to other ventral hernias, umbilical and epigastric hernias must be mesh repaired. The difficulties involved in mesh placement and in mesh-related complications could be the reason many small abdominal hernias are still primarily closed. In laparoscopic repair, a mesh is placed intraperitoneally, while the most common procedure in open surgery is pre-peritoneal mesh placement. A recently developed alternative method is the so-called patch repair, in which a mesh can be placed intraperitoneally through open surgery. In theory, such patches are particularly suitable for small hernias due to a reduction in the required dissection. This simple procedure is described in several studies. It is still unclear whether this new approach is associated with an equal risk of recurrence and complications compared with pre-peritoneal meshes. The material of the patch is in direct contact with intra-abdominal organs, and it is unknown whether this leads to more complications. On the other hand, the smaller dissection in the pre-peritoneal plane may lead to a reduction in wound complications.
A total of 346 patients suffering from an umbilical or epigastric hernia will be included in a multi-centre patient-blinded trial, comparing mesh repair with patch repair. Randomisation will take place for the two operation techniques. The two devices investigated are a flat pre-peritoneal mesh and a Proceed Ventral Patch®. Stratification will occur per centre. Post-operative evaluation will take place after 1, 3, 12 and 24 months. The number of complications requiring treatment is the primary endpoint. Secondary endpoints are Verbal Descriptor Scale (VDS) pain score and VDS cosmetic score, operation duration, recurrence and costs. An intention-to-treat analysis will be performed.
This trial is one of the first of its kind to compare different mesh devices in a randomized controlled setting. The results will help to evaluate mesh repair for epigastric and umbilical hernia, and find a surgical method that minimizes the complication rate.
Netherlands Trial Register (NTR) www.trialregister.nl 2010 NTR2514 NL33995.060.10.
Incisional hernia occurs in approximately 40% of high-risk patients after midline laparotomy. Prophylactic mesh placement has shown promising results, but long-term outcomes are needed. The present study aimed to assess the long-term incisional hernia rates of the previously conducted PRIMA trial with radiological follow-up.
In the PRIMA trial, patients with increased risk of incisional hernia formation (AAA or BMI ≥27 kg/m²) were randomised in a 1:2:2 ratio to primary suture, onlay mesh or sublay mesh closure at eleven institutions in three countries. Incisional hernia during follow-up was diagnosed by any of CT, ultrasound or physical examination, or during surgery. Assessors and patients were blinded until 2-year follow-up. Time-to-event analysis according to the intention-to-treat principle was performed with the Kaplan-Meier method and Cox proportional hazard models. Trial registration: NCT00761475 (ClinicalTrials.gov).
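For readers unfamiliar with the Kaplan-Meier method, the following Python sketch computes the product-limit estimate on made-up data. It is a generic textbook illustration, not the trial's analysis code: the follow-up times and event indicators are invented, the function name `kaplan_meier` is hypothetical, and confidence intervals and the Cox model are omitted.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the event-free probability.

    times  : follow-up time per patient
    events : 1 if the event (e.g. hernia) occurred at that time,
             0 if the patient was censored
    Returns a list of (time, S(time)) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        at_risk -= n_leaving
    return curve


# Invented toy data: five patients, follow-up in years.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: event-free probability {s:.2f}")
```

The cumulative incidence at a given time is one minus the event-free probability, which is how a rate such as a five-year hernia rate is read off the curve.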
Between 2009 and 2012, 480 patients were randomized: 107 to primary suture, 188 to onlay mesh and 185 to sublay mesh. Five-year incisional hernia rates were 53.4% (95% CI: 40.4-64.8), 24.7% (95% CI: 12.7-38.8) and 29.8% (95% CI: 17.9-42.6), respectively. Compared to primary suture, onlay mesh (HR: 0.390, 95% CI: 0.248-0.614, p < 0.001) and sublay mesh (HR: 0.485, 95% CI: 0.309-0.761, p = 0.002) were associated with a significantly lower risk of incisional hernia development.
Prophylactic mesh placement remained effective in reducing incisional hernia occurrence after midline laparotomy in high-risk patients during long-term follow-up. Hernia rates in the primary suture group were higher than previously anticipated.
B. Braun.