Long noncoding RNAs (lncRNAs) are a recently discovered class of non-protein-coding RNAs that have increasingly been shown to act as regulatory molecules in a wide variety of biological processes. The functional role of most members of this class remains an enigma, with the exception of a few such as Malat and HOTAIR. Little is known about the regulatory interactions between noncoding RNA classes. Recent reports have suggested that lncRNAs could interact with other classes of noncoding RNAs, including microRNAs (miRNAs), and modulate their regulatory role through these interactions. We hypothesized that lncRNAs could participate as a layer of regulatory interactions with miRNAs. The availability of genome-scale datasets for Argonaute targets across the human transcriptome prompted us to reconstruct a genome-scale network of interactions between miRNAs and lncRNAs.
We used well-characterized experimental Photoactivatable-Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation (PAR-CLIP) datasets and the recent genome-wide annotations for lncRNAs in the public domain to construct a comprehensive transcriptome-wide map of miRNA regulatory elements. Comparative analysis revealed that, in addition to targeting protein-coding transcripts, miRNAs could also target lncRNAs, thus participating in a novel layer of regulatory interactions between noncoding RNA classes. Furthermore, we modeled one example of a miRNA-lncRNA interaction in zebrafish. We also found that the miRNA regulatory elements have a positional preference, clustering towards the mid regions and 3' ends of the long noncoding transcripts. Finally, we reconstructed a genome-wide map of miRNA interactions with lncRNAs as well as messenger RNAs.
This analysis suggests widespread regulatory interactions between noncoding RNA classes and points to a novel functional role for lncRNAs. We also present the first transcriptome-scale study of miRNA-lncRNA interactions and the first report of a genome-scale reconstruction of a noncoding RNA regulatory interactome involving lncRNAs.
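The positional preference described above can be illustrated with a minimal sketch. It assumes that each miRNA regulatory element is available as a (start, end, transcript length) tuple in transcript coordinates and that transcripts are split into equal thirds; the function names, the three-way split, and the example coordinates are illustrative assumptions, not the procedure used in the study.

```python
from collections import Counter


def region_of(site_start, site_end, transcript_length):
    """Classify a site by the relative position of its midpoint on the transcript."""
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    fraction = ((site_start + site_end) / 2) / transcript_length
    if fraction < 1 / 3:
        return "5' region"
    if fraction < 2 / 3:
        return "mid region"
    return "3' region"


def positional_profile(sites):
    """Count sites per region; `sites` holds (start, end, transcript_length) tuples."""
    return Counter(region_of(start, end, length) for start, end, length in sites)


# Hypothetical sites on a 900-nt transcript, for illustration only.
example_sites = [(10, 30, 900), (400, 420, 900), (850, 870, 900), (700, 720, 900)]
profile = positional_profile(example_sites)
print(dict(profile))
```

Normalizing by transcript length lets transcripts of different sizes contribute to the same profile; a finer binning (for example deciles) would follow the same pattern.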
A major fraction of the transcriptome of higher organisms comprises an extensive repertoire of long non-coding RNAs (lncRNAs), which are expressed in a cell-type- and developmental-stage-specific manner. While lncRNAs are a proven component of epigenetic modulation of gene expression, epigenetic regulation of lncRNAs themselves remains poorly understood. Here we have analysed pan-genomic DNA methylation and histone modification marks (H3K4me3, H3K9me3, H3K27me3 and H3K36me3) associated with the transcription start sites (TSS) of lncRNAs in four different cell types and three different tissue types representing various cellular stages. We observe that the histone marks associated with active transcription, H3K4me3 and H3K36me3, along with the repressive histone mark H3K27me3, have a similar distribution pattern around the TSS irrespective of cell type. Also, the density of these marks correlates well with the expression of protein-coding and lncRNA genes. In contrast, lncRNA genes harbour a higher methylation density around the TSS than protein-coding genes, regardless of their expression status. Furthermore, we found that DNA methylation, along with the other repressive histone mark H3K9me3, does not seem to play a role in lncRNA expression. Thus, our observations suggest that epigenetic regulation of lncRNAs shares common features with that of mRNAs, except for the role of DNA methylation, which is markedly dissimilar.
The Middle East and North Africa (MENA) encompasses unique populations with a rich history and characteristic ethnic, linguistic and genetic diversity. The genetic diversity of the MENA region has been largely unknown. The recent availability of whole-exome and whole-genome sequences from the region has made it possible to collect population-specific allele frequencies. The integration of data sets from this region would provide insights into its landscape of genetic variants. We systematically integrated genetic variants from multiple data sets available from this region to create a compendium of over 26 million genetic variations. The variants were systematically annotated, and their allele frequencies in the data sets were computed and made available through a web interface that enables quick queries. As a proof of principle for the application of the compendium to genetic epidemiology, we analyzed the allele frequencies of variants in the transglutaminase 1 (TGM1) gene, associated with autosomal recessive lamellar ichthyosis. Our analysis revealed that the carrier frequency of selected variants differed widely, with significant interethnic differences. To the best of our knowledge, al mena is the first and most comprehensive repertoire of genetic variations from the Arab, Middle Eastern and North African region. We hope al mena will accelerate Precision Medicine in the region.
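To make the allele-frequency computation concrete, here is a minimal sketch of how per-data-set and pooled frequencies for one variant could be derived from allele counts. It assumes each data set reports an alternate-allele count (AC) and a total number of genotyped alleles (AN); the data-set labels, counts, and function names are hypothetical and do not come from the compendium.

```python
def allele_frequency(allele_count, allele_number):
    """Frequency of the alternate allele in one data set (AC / AN)."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def pooled_allele_frequency(counts):
    """Pool AC and AN over data sets; `counts` maps a label to an (AC, AN) pair."""
    total_ac = sum(ac for ac, _ in counts.values())
    total_an = sum(an for _, an in counts.values())
    return allele_frequency(total_ac, total_an)


# Hypothetical counts for a single variant in three data sets.
variant_counts = {
    "dataset_A": (3, 2000),
    "dataset_B": (1, 1000),
    "dataset_C": (0, 1000),
}

per_dataset = {name: allele_frequency(ac, an) for name, (ac, an) in variant_counts.items()}
pooled = pooled_allele_frequency(variant_counts)
print(per_dataset, pooled)
```

Pooling the counts rather than averaging the per-data-set frequencies weights each data set by its sample size, which matters when cohorts differ greatly in size.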
Long non-coding RNAs (lncRNAs) form the largest class of non-protein-coding genes in the human genome. While a small subset of well-characterized lncRNAs has been shown to play significant roles in diverse biological functions such as chromatin modification, post-transcriptional regulation and imprinting, the functional significance of the vast majority remains an enigma. Increasing evidence implicating lncRNAs in various diseases, including cancer, and in major developmental processes has further increased the need for mechanistic insights into lncRNA function. Here, we present a comprehensive review of the computational approaches and tools available for the identification and annotation of long non-coding RNAs. We also discuss a conceptual roadmap for systematically exploring the functional properties of lncRNAs using computational approaches.
Only a handful of long noncoding RNAs have been functionally characterized. They are known to modulate regulation by interacting with other biomolecules in the cell: DNA, RNA and protein. Though there have been detailed investigations of lncRNA-miRNA and lncRNA-protein interactions, the interaction of lncRNAs with DNA has not been studied extensively. In the present study, we explore whether lncRNAs could modulate genomic regulation by interacting with DNA through the formation of highly stable DNA:DNA:RNA triplexes.
We computationally screened 23,898 lncRNA transcripts, as annotated by GENCODE, across the human genome for potential triplex-forming sequence stretches (PTS). The PTS frequencies were compared across 5'UTRs, CDS, 3'UTRs, introns, promoters and the 1000 bases downstream of transcription termination sites. These regions were annotated by mapping to experimental regulatory regions, classes of repeat regions and transcription factors. Using biophysical methods, we validated a few putative triplex-mediated interactions in which the lncRNA-gene pair interacts via a pyrimidine triplex motif.
We identified 2,004,034 PTS sites, which were enriched in promoter and intronic regions across the human genome. Additional analysis of the association of PTS with core promoter elements revealed a systematic paucity of PTS in all regulatory regions except TF binding sites. A total of 25 transcription factors were found to be associated with PTS. Using an interaction network, we showed that a subset of the triplex-forming lncRNAs has a positive association with gene promoters. We also demonstrated an in vitro interaction of one lncRNA candidate with its predicted gene target promoter regions.
Our analysis shows that PTS are enriched in gene promoters and are largely associated with simple repeats. The current study suggests a major role for a subset of lncRNAs in modulating chromatin organization through CTCF and NSRF proteins.
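A screen for potential triplex-forming stretches can be sketched as a scan for uninterrupted homopurine or homopyrimidine runs. The abstract does not state the criteria used in the study, so the minimum run length of 15 nt, the absence of tolerated interruptions, and the function name `find_pts` are all assumptions made for illustration.

```python
import re


def find_pts(sequence, min_length=15):
    """Return (start, end, kind) for homopurine or homopyrimidine runs.

    Coordinates are 0-based and end-exclusive. `min_length` is an assumed
    threshold, not a value taken from the study.
    """
    seq = sequence.upper().replace("U", "T")  # treat RNA and DNA input alike
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_length, min_length))
    hits = []
    for match in pattern.finditer(seq):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        hits.append((match.start(), match.end(), kind))
    return hits


# Hypothetical sequence containing one purine run and one pyrimidine run.
example = "ACGT" + "AG" * 8 + "CATG" + "CT" * 9 + "GA"
hits = find_pts(example)
print(hits)
```

A realistic screen would probably also allow a small number of interruptions and score candidate stretches, which a plain regular expression does not capture.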
Advances in Next Generation Sequencing have made rapid variant discovery and detection widely accessible. To facilitate a better understanding of the nature of these variants, the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG-AMP) have issued a set of guidelines for variant classification. However, given the vast number of variants associated with any disorder, it is impractical to apply these guidelines manually to all known variants. Machine learning methodologies offer a rapid way to classify large numbers of variants, including variants of uncertain significance, as either pathogenic or benign. Here we classify ATP7B genetic variants by employing ML and AI algorithms trained on our well-annotated WilsonGen dataset.
We have trained and validated two algorithms, TabNet and XGBoost, on a high-confidence dataset of manually annotated, ACMG & AMP classified variants of the ATP7B gene associated with Wilson's disease.
Using an independent validation dataset of ACMG & AMP classified variants, as well as a patient set of functionally validated variants, we showed how both algorithms perform and can be used to classify large numbers of variants in clinical as well as research settings.
We have created a ready-to-deploy tool that can classify variants linked with Wilson's disease as pathogenic or benign. It can be used by both clinicians and researchers to better understand the disease through the nature of the genetic variants associated with it.
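Evaluating a pathogenic/benign classifier on an independent validation set typically comes down to comparing predicted and reference labels. The sketch below computes sensitivity, specificity and accuracy in plain Python; it is independent of TabNet and XGBoost, and the label strings, function name, and example labels are assumptions rather than details of the published tool.

```python
def binary_metrics(true_labels, predicted_labels, positive="pathogenic"):
    """Sensitivity, specificity and accuracy for a two-class variant classifier."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Hypothetical reference and predicted labels for five variants.
truth = ["pathogenic", "pathogenic", "benign", "benign", "pathogenic"]
predicted = ["pathogenic", "benign", "benign", "pathogenic", "pathogenic"]
metrics = binary_metrics(truth, predicted)
print(metrics)
```

Reporting sensitivity and specificity separately is useful here because missing a pathogenic variant and over-calling a benign one carry different costs in a clinical setting.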
miRNAs have emerged as important players in the regulation of gene expression, and their deregulation is a common feature of a variety of diseases, especially cancer. Currently, many efforts are focused on studying miRNA expression patterns, as well as on miRNA target validation. Here, we show that overexpression of the miR-23a~27a~24-2 cluster in HEK293T cells induces apoptosis through caspase-dependent as well as caspase-independent pathways, as shown by the annexin assay, caspase activation, and the release of cytochrome c and AIF (apoptosis-inducing factor) from mitochondria. Furthermore, the overexpressed cluster modulates the expression of a number of genes involved in apoptosis, including FADD (Fas-Associated protein with Death Domain). Bioinformatically, FADD is predicted to be a target of hsa-miR-27a and, interestingly, FADD protein was found to be upregulated, consistent with the very low expression of hsa-miR-27a in HEK293T cells. This effect was direct, as hsa-miR-27a negatively regulated the expression of a FADD 3'UTR-based reporter construct. Moreover, we showed that overexpression of miR-23a~27a~24-2 sensitized HEK293T cells to TNF-alpha cytotoxicity. Taken together, our study demonstrates that overexpression of the miR-23a~27a~24-2 cluster enhances TNF-alpha-induced apoptosis in HEK293T cells, which provides new insights for the development of novel therapeutics for cancer.
COVID-19, the disease caused by the novel SARS-CoV-2 coronavirus, originated as an isolated outbreak in the Hubei province of China but soon created a global pandemic and is now a major threat to healthcare systems worldwide. Following the rapid human-to-human transmission of the infection, institutes around the world have made efforts to generate genome sequence data for the virus. With thousands of genome sequences for SARS-CoV-2 now available in the public domain, it is possible to analyze the sequences and gain a deeper understanding of the disease, its origin, and its epidemiology. Phylogenetic analysis is a potentially powerful tool for tracking the transmission pattern of the virus with a view to aiding identification of potential interventions. Toward this goal, we have created a comprehensive protocol for the analysis and phylogenetic clustering of SARS-CoV-2 genomes using Nextstrain, a powerful open-source tool for the real-time interactive visualization of genome sequencing data. Approaches to focus the phylogenetic clustering analysis on a particular region of interest are detailed in this protocol.
Wilson disease (WD) is one of the most prevalent genetic diseases, with an estimated global carrier frequency of 1 in 90 and a prevalence of 1 in 30,000. The disease is named after Kinnier Wilson, who first described it, and is caused by the accumulation of copper (Cu) in various organs, including the liver, central nervous system, cornea, kidney, joints and cardiac muscle, which contributes to the characteristic clinical features of WD. A number of studies have reported genetic variants in the ATP7B gene from diverse ethnic and geographical origins. The recent advent of next-generation sequencing approaches has also enabled the discovery of a large number of novel variants in the gene associated with the disease. Previous attempts have been made to compile the knowledgebase and spectrum of genetic variants from across the multitude of publications, but their utility has been limited by the significant differences in the approaches used to qualify the pathogenicity of variants in each publication. The recent formulation of guidelines and algorithms for assessing the pathogenicity of variants, jointly put forward by the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG & AMP), has provided a framework for evidence-based and systematic assessment of the pathogenicity of variants. In this paper, we describe a comprehensive resource of genetic variants in the ATP7B gene, manually curated from literature and data resources and systematically annotated using the ACMG & AMP guidelines for assessing pathogenicity. The resource therefore serves as a central point for clinicians and geneticists working on WD and, to the best of our knowledge, is the most comprehensive and only clinically annotated resource for WD. The resource is available at URL http://clingen.igib.res.in/WilsonGen/. We compiled a total of 3662 genetic variants from publications and databases associated with WD.
Of the variants compiled, a total of 1458 were found to be unique entries. This is the largest WD database, comprising 656 pathogenic/likely pathogenic variants reported and classified according to ACMG & AMP guidelines. We also mapped all the pathogenic variants corresponding to the ATP7B protein from literature and other databases. In addition, the geographical origin and distribution of the reported ATP7B pathogenic variants are mapped in the database.
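The quoted carrier frequency (1 in 90) and prevalence (1 in 30,000) for WD can be checked against each other with a short calculation. The sketch below assumes autosomal recessive inheritance under Hardy-Weinberg equilibrium, with prevalence taken as the frequency of individuals carrying two disease alleles; this is a textbook approximation added for illustration, not a calculation from the paper, and it ignores factors such as consanguinity and incomplete penetrance.

```python
import math


def carrier_rate_from_prevalence(prevalence):
    """Expected carrier rate for a recessive disorder under Hardy-Weinberg equilibrium."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    q = math.sqrt(prevalence)  # pooled disease-allele frequency, since prevalence = q**2
    return 2 * q * (1 - q)     # heterozygote (carrier) frequency 2pq


rate = carrier_rate_from_prevalence(1 / 30000)
one_in = 1 / rate
print(f"expected carrier rate: about 1 in {one_in:.0f}")
```

Under these assumptions the expected carrier rate comes out at roughly 1 in 87, which is in line with the estimate of 1 in 90 quoted above.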