Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for the analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which achieved significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
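The idea of consensus prediction can be illustrated with a minimal sketch of unweighted majority voting across tools. This is not PredictSNP's actual combination scheme, which the abstract does not specify; the function name, the labels, the tool names and the example predictions are all hypothetical. Tools that return no prediction are skipped, which shows how a consensus can still return a result for a mutation that an individual tool cannot score.

```python
from collections import Counter
from typing import Mapping, Optional


def consensus_call(predictions: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the majority label among tools that produced a prediction.

    Hypothetical helper: labels are 'deleterious' or 'neutral', and None
    means the tool returned no result. Ties and empty input yield None.
    """
    votes = Counter(label for label in predictions.values() if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear majority
    return ranked[0][0]


# Made-up predictions for one mutation from six placeholder tools.
example = {
    "tool_1": "deleterious",
    "tool_2": "deleterious",
    "tool_3": "neutral",
    "tool_4": None,  # this tool could not score the mutation
    "tool_5": "deleterious",
    "tool_6": "neutral",
}

print(consensus_call(example))
```

A weighted vote, for example one that scales each tool's vote by a per-tool confidence, would follow the same structure with a weighted sum in place of the counter.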
Fuchs endothelial corneal dystrophy (FECD) is a common, familial disease of the corneal endothelium and is the leading indication for corneal transplantation. Variation in the transcription factor 4 (TCF4) gene has been identified as a major contributor to the disease. We tested for an association between an intronic TGC trinucleotide repeat in TCF4 and FECD by determining repeat length in 66 affected participants with severe FECD and 63 participants with normal corneas in a 3-stage discovery/replication/validation study. PCR primers flanking the TGC repeat were used to amplify leukocyte-derived genomic DNA. Repeat length was determined by direct sequencing, short tandem repeat (STR) assay and Southern blotting. Genomic Southern blots were used to evaluate samples for which only a single allele was identified by STR analysis. Compiling data for the 3 arms of the study, a TGC repeat length >50 was present in 79% of FECD cases and in 3% of normal controls (p<0.001). Among cases, 52 of 66 (79%) subjects had >50 TGC repeats, 13 (20%) had <40 repeats and 1 (2%) had an intermediate repeat length. In comparison, only 2 of 63 (3%) unaffected control subjects had >50 repeats, 60 (95%) had <40 repeats and 1 (2%) had an intermediate repeat length. The repeat length was greater than 1000 in 4 FECD cases. The sensitivity and specificity of >50 TGC repeats for identifying FECD in this patient cohort were 79% and 96%, respectively. The expanded TGC repeat was more specific for FECD cases than the previously identified, highly associated single nucleotide polymorphism rs613872 (specificity = 79%). The TGC trinucleotide repeat expansion in TCF4 is strongly associated with FECD, and a repeat length >50 is highly specific for the disease. This association suggests that trinucleotide expansion may play a pathogenic role in the majority of FECD cases and is a predictor of disease risk.
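The sensitivity and specificity quoted above can be recomputed from the reported counts. The sketch below assumes that a repeat length >50 is the positive test result and that both intermediate and <40 repeat lengths count as negative; the function name is hypothetical. Under that assumption the counts give roughly 78.8% and 96.8%, close to the reported 79% and 96%.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of cases that test positive
    specificity = tn / (tn + fp)  # fraction of controls that test negative
    return sensitivity, specificity


# Counts taken from the abstract: 52 of 66 cases and 2 of 63 controls
# had >50 repeats. Everything else is treated as test-negative (assumption).
cases, cases_positive = 66, 52
controls, controls_positive = 63, 2

sens, spec = sensitivity_specificity(
    tp=cases_positive,
    fn=cases - cases_positive,
    tn=controls - controls_positive,
    fp=controls_positive,
)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```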
Massive parallel sequencing has the potential to replace microarrays as the method for transcriptome profiling. Currently there are two protocols: full-length RNA sequencing (RNA-SEQ) and 3'-tag digital gene expression (DGE). In this preliminary effort, we evaluated the 3' DGE approach using two reference RNA samples from the MicroArray Quality Control Consortium (MAQC).
Using the brain RNA sample from multiple runs, we demonstrated that the transcript profiles from 3' DGE were highly reproducible between technical and biological replicates from libraries constructed by the same lab and even by different labs, and between two generations of Illumina's Genome Analyzers. Approximately 65% of all sequence reads mapped to mitochondrial genes, ribosomal RNAs, and canonical transcripts. The expression profiles of brain RNA and universal human reference RNA were compared, which demonstrated that DGE was also highly quantitative, with excellent correlation of differential expression with quantitative real-time PCR. Furthermore, one lane of 3' DGE sequencing, using the current sequencing chemistry and image processing software, had a wider dynamic range for transcriptome profiling and was able to detect lowly expressed genes that are normally below the detection threshold of microarrays.
3' tag DGE profiling with massive parallel sequencing achieved high sensitivity and reproducibility for transcriptome profiling. Although, unlike RNA-SEQ, it cannot detect alternative splicing events, it is much more affordable and clearly outperformed microarrays (Affymetrix) in detecting low-abundance transcripts.
As reliable, efficient genome sequencing becomes ubiquitous, the need for similarly reliable and efficient variant calling becomes increasingly important. The Genome Analysis Toolkit (GATK), maintained by the Broad Institute, is currently the widely accepted standard for variant calling software. However, alternative solutions may provide faster variant calling without sacrificing accuracy. One such alternative is Sentieon DNASeq, a toolkit analogous to GATK but built on a highly optimized backend. We conducted an independent evaluation of the DNASeq single-sample variant calling pipeline in comparison to that of GATK. Our results support the near-identical accuracy of the two software packages, showcase optimal scalability and great speed from Sentieon, and describe computational performance considerations for the deployment of DNASeq.
Small intestine neuroendocrine tumors (SI-NETs) are the most common malignancy of the small bowel. Several clinical trials target PI3K/Akt/mTOR signaling; however, it is unknown whether these or other genes are genetically altered in these tumors. To address the underlying genetics, we analyzed 48 SI-NETs by massively parallel exome sequencing. We detected an average of 0.1 somatic single nucleotide variants (SNVs) per 10^6 nucleotides (range, 0-0.59), mostly transitions (C>T and A>G), which suggests that SI-NETs are stable cancers. 197 protein-altering somatic SNVs affected a preponderance of cancer genes, including FGFR2, MEN1, HOOK3, EZH2, MLF1, CARD11, VHL, NONO, and SMAD1. Integrative analysis of SNVs and somatic copy number variations identified recurrently altered mechanisms of carcinogenesis: chromatin remodeling, DNA damage, apoptosis, RAS signaling, and axon guidance. Candidate therapeutically relevant alterations were found in 35 patients, including SRC, SMAD family genes, AURKA, EGFR, HSP90, and PDGFR. Mutually exclusive amplification of AKT1 or AKT2 was the most common event in the 16 patients with alterations of PI3K/Akt/mTOR signaling. We conclude that sequencing-based analysis may provide provisional grouping of SI-NETs by therapeutic targets or deregulated pathways.
Expansion of CTG trinucleotide repeats (TNR) in the transcription factor 4 (TCF4) gene is highly associated with Fuchs Endothelial Corneal Dystrophy (FECD). Due to limitations in the availability of DNA from diseased corneal endothelium, CTG repeats in FECD patients have typically been sized using DNA samples isolated from peripheral blood leukocytes. However, it is not feasible to extract enough DNA from surgically isolated FECD corneal endothelial tissue to determine repeat length with current technology. To circumvent this issue, total RNA was isolated from FECD corneal endothelium and sequenced using long-read sequencing. Southern blotting was also performed on DNA samples isolated from primary cultures of corneal endothelium from these same affected individuals. Both long-read sequencing and Southern blot analysis showed significantly longer CTG TNR expansion (>1000 repeats) in the corneal endothelium from FECD patients than that characterized in leukocytes from the same individuals (<90 repeats). Our findings suggest that the TCF4 CTG repeat expansions in the FECD corneal endothelium are much longer than those found in leukocytes.
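As a rough illustration of what sizing a repeat from a sequence read involves, the sketch below counts the repeat units in the longest uninterrupted CTG run of a DNA-alphabet string. It is a simplified, hypothetical helper and not the analysis used in the study: real long reads contain sequencing errors and interruptions, so exact matching would tend to underestimate repeat length, and the example read is made up.

```python
import re


def longest_ctg_run(seq: str) -> int:
    """Return the number of repeat units in the longest uninterrupted (CTG)n run.

    Hypothetical helper for illustration; expects a DNA-alphabet string
    and requires exact, error-free CTG units.
    """
    runs = re.findall(r"(?:CTG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Made-up read: a three-unit run, flanking bases, then a single CTG.
read = "AAACTGCTGCTGTTTCTG"
print(longest_ctg_run(read))
```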
Fuchs endothelial corneal dystrophy (FECD) is an inherited degenerative disease that affects the internal endothelial cell monolayer of the cornea and can result in corneal edema and vision loss in severe cases. FECD affects ∼5% of middle-aged Caucasians in the United States and accounts for >14,000 corneal transplantations annually. Among the several genes and loci associated with FECD, the strongest association is with an intronic (CTG·CAG)n trinucleotide repeat expansion in the TCF4 gene, which is found in the majority of affected patients. Corneal endothelial cells from FECD patients harbor a poly(CUG)n RNA that can be visualized as RNA foci containing this condensed RNA and associated proteins. Similar to myotonic dystrophy type 1, the poly(CUG)n RNA co-localizes with and sequesters the mRNA-splicing factor MBNL1, leading to missplicing of essential MBNL1-regulated mRNAs. Such foci and missplicing are not observed in similar cells from FECD patients who lack the repeat expansion. RNA-Seq splicing data from the corneal endothelia of FECD patients and controls reveal hundreds of differential alternative splicing events. These include events previously characterized in the context of myotonic dystrophy type 1 and epithelial-to-mesenchymal transition, as well as splicing changes in genes related to proposed mechanisms of FECD pathogenesis. We report the first instance of RNA toxicity and missplicing in a common non-neurological/neuromuscular disease associated with a repeat expansion. The FECD patient population with this (CTG·CAG)n trinucleotide repeat expansion exceeds that of the combined number of patients in all other microsatellite expansion disorders.
Background: Expansion of intronic (CTG·CAG)n repeats in TCF4 is found in most Fuchs endothelial corneal dystrophy (FECD) patients.
Results: RNA foci co-localizing with the splicing factor MBNL1 are found in FECD cells, and changes in mRNA splicing occur.
Conclusion: Trinucleotide repeat expansion in FECD is associated with RNA focus formation and missplicing.
Significance: RNA toxicity occurs in a disease affecting millions of patients.
The strongest genetic association with Fuchs' endothelial corneal dystrophy (FECD) is the presence of an intronic (CTG·CAG)n trinucleotide repeat (TNR) expansion in the transcription factor 4 (TCF4) gene. Repeat-associated non-ATG (RAN) translation, an unconventional protein translation mechanism that does not require an initiating ATG, has been described in many TNR expansion diseases, including myotonic dystrophy type 1 (DM1). Given the similarities between DM1 and FECD, we wished to determine whether RAN translation occurs in FECD.
Antibodies against peptides in the C-terminus of putative RAN translation products from TCF4 were raised and validated by Western blotting and immunofluorescence (IF). CTG·CAG repeats of various lengths in the context of the TCF4 gene were cloned in frame with a 3× FLAG tag and transfected in human cells. IF with antipeptide and anti-FLAG antibodies, as well as cytotoxicity and cell proliferation assays, were performed in these transfected cells. Corneal endothelium derived from patients with FECD was probed with validated antibodies by IF.
CTG·CAG repeats in the context of the TCF4 gene are transcribed and translated via non-ATG initiation in transfected cells and confer toxicity to an immortalized corneal endothelial cell line. An antipeptide antibody raised against the C-terminus of the TCF4 poly-cysteine frame recognized RAN translation products by IF in cells transfected with CTG·CAG repeats and in FECD corneal endothelium.
Expanded CTG·CAG repeats in the context of the third intron of TCF4 are transcribed and translated via non-ATG initiation, providing evidence for RAN translation in corneal endothelium of patients with FECD.
Fuchs Endothelial Corneal Dystrophy (FECD) is a late onset, autosomal dominant eye disease that can lead to loss of vision. Expansion of a CTG trinucleotide repeat in the third intron of the transcription factor 4 (TCF4) gene is highly associated with FECD. However, only about 75% of FECD patients in the northern European population possess an expansion of this repeat. The remaining FECD cases appear to be associated with variants in other genes. To better understand the pathophysiology of this disease, we compared gene expression profiles of corneal endothelium from FECD patients with an expanded trinucleotide repeat (RE+) to those that do not have a repeat expansion (RE-). Comparative analysis of these two cohorts showed widespread RNA mis-splicing in RE+, but not in RE- samples. Quantitatively, we identified 39 genes in which expression was significantly different between RE+ and RE- samples. Examination of the mutation profiles in the RE- samples did not find any mutations in genes previously associated with FECD, but did reveal one sample with a rare variant of laminin subunit gamma 1 (LAMC1) and three samples with rare variants in the gene coding for the mitochondrial protein peripheral-type benzodiazepine receptor-associated protein 1 (TSPOAP1).
MicroRNAs play a role in regulating diverse biological processes and have considerable utility as molecular markers for diagnosis and monitoring of human disease. Several technologies are available commercially for measuring microRNA expression. However, cross-platform comparisons do not necessarily correlate well, making it difficult to determine which platform most closely represents the true microRNA expression level in a tissue. To address this issue, we have analyzed RNA derived from cell lines, as well as fresh frozen and formalin-fixed paraffin embedded tissues, using Affymetrix, Agilent, and Illumina microRNA arrays, NanoString counting, and Illumina Next Generation Sequencing. We compared the performance within and between the different platforms, and then verified these results against quantitative PCR data. Our results demonstrate that the within-platform reproducibility for each method is consistently high and, although the gene expression profiles from each platform show unique traits, comparison of genes that were commonly detectable showed that detection of microRNA transcripts was similar across multiple platforms.