RNA is considered an attractive target for new small-molecule drugs. Designing active compounds can be facilitated by computational modeling. Most of the available tools developed for these prediction purposes, such as molecular docking or scoring functions, are parametrized for protein targets. The performance of these methods, when applied to RNA-ligand systems, is insufficient. To overcome these problems, we developed AnnapuRNA, a new knowledge-based scoring function designed to evaluate RNA-ligand complex structures generated by any computational docking method. We also evaluated three main factors that may influence the structure prediction, i.e., the starting conformer of a ligand, the docking program, and the scoring function used. We applied the AnnapuRNA method for a post-hoc study of the recently published structures of the FMN riboswitch. Software is available at https://github.com/filipspl/AnnapuRNA.
Intrinsically unstructured proteins (IUPs) lack a well-defined three-dimensional structure. Some of them may assume a locally stable structure under specific conditions, e.g. upon interaction with another molecule, while others function in a permanently unstructured state. The discovery of IUPs challenged the traditional protein structure paradigm, which stated that a specific well-defined structure defines the function of the protein. As of December 2011, approximately 60 methods for computational prediction of protein disorder from sequence have been made publicly available. They are based on different approaches, such as utilizing evolutionary information, energy functions, and various statistical and machine learning methods.
Given the diversity of existing intrinsic disorder prediction methods, we decided to test whether it is possible to combine them into a more accurate meta-prediction method. We developed a method based on 13 arbitrarily chosen disorder predictors, in which the final consensus was weighted by the accuracy of the methods. We have also developed a disorder predictor, GSmetaDisorder3D, that uses no third-party disorder predictors and instead relies on alignments to known protein structures, reported by protein fold-recognition methods, to infer the potentially structured and unstructured regions. Following the success of our disorder predictors in the CASP8 benchmark, we combined them into a meta-meta predictor called GSmetaDisorderMD, which was the top-scoring method in the subsequent CASP9 benchmark.
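The accuracy-weighted consensus described above can be illustrated with a minimal sketch. It assumes that each predictor returns one disorder score per residue in the range 0 to 1 and that a single accuracy value per predictor is used as its weight; the function name `weighted_consensus`, the 0.5 threshold, and the predictor labels are hypothetical and are not taken from the MetaDisorder implementation.

```python
from typing import Dict, List, Tuple


def weighted_consensus(
    scores: Dict[str, List[float]],
    accuracy: Dict[str, float],
    threshold: float = 0.5,  # assumed cutoff, not from the source
) -> Tuple[List[float], List[bool]]:
    """Combine per-residue disorder scores, weighting each predictor by its accuracy."""
    names = list(scores)
    if not names:
        raise ValueError("at least one predictor is required")
    length = len(scores[names[0]])
    if any(len(scores[name]) != length for name in names):
        raise ValueError("all predictors must score the same number of residues")

    total_weight = sum(accuracy[name] for name in names)
    # Weighted mean of the predictor scores at each residue position.
    consensus = [
        sum(accuracy[name] * scores[name][i] for name in names) / total_weight
        for i in range(length)
    ]
    # A residue is called disordered when the consensus reaches the threshold.
    calls = [value >= threshold for value in consensus]
    return consensus, calls


if __name__ == "__main__":
    example_scores = {"predictor_A": [1.0, 0.0], "predictor_B": [0.0, 0.0]}
    example_accuracy = {"predictor_A": 0.75, "predictor_B": 0.25}
    print(weighted_consensus(example_scores, example_accuracy))
```

In this toy input the more accurate predictor dominates the first residue, so the consensus calls it disordered even though the two predictors disagree.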
A series of disorder predictors described in this article is available as a MetaDisorder web server at http://iimcb.genesilico.pl/metadisorder/. Results are presented both in an easily interpretable, interactive mode and in a simple text format suitable for machine processing.
Abstract
The MODOMICS database has been, since 2006, a manually curated and centralized resource, storing and distributing comprehensive information about modified ribonucleosides. Originally, it only contained data on the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. Over the years, prompted by the accumulation of new knowledge and new types of data, it has been updated with new information and functionalities. In this new release, we have created a catalog of RNA modifications linked to human diseases, e.g., due to mutations in genes encoding modification enzymes. MODOMICS has been linked extensively to RCSB Protein Data Bank, and sequences of experimentally determined RNA structures with modified residues have been added. This expansion was accompanied by including nucleotide 5′-monophosphate residues. We redesigned the web interface and upgraded the database backend. In addition, a search engine for chemically similar modified residues has been included that can be queried by SMILES codes or by drawing chemical molecules. Finally, previously available datasets of modified residues, biosynthetic pathways, and RNA-modifying enzymes have been updated. Overall, we provide users with a new, enhanced, and restyled tool for research on RNA modification. MODOMICS is available at https://iimcb.genesilico.pl/modomics/.
Protein-RNA interactions play fundamental roles in many biological processes. Understanding the molecular mechanism of protein-RNA recognition and formation of protein-RNA complexes is a major challenge in structural biology. Unfortunately, the experimental determination of protein-RNA complexes is tedious and difficult, both by X-ray crystallography and NMR. For many interacting proteins and RNAs the individual structures are available, enabling prediction of complex structures by computational docking. However, methods for protein-RNA docking remain scarce, in particular in comparison to the numerous methods for protein-protein docking.
We developed two medium-resolution, knowledge-based potentials for scoring protein-RNA models obtained by docking: the quasi-chemical potential (QUASI-RNP) and the Decoys As the Reference State potential (DARS-RNP). Both potentials use a coarse-grained representation for both RNA and protein molecules and are capable of dealing with RNA structures with posttranscriptionally modified residues. We compared the discriminative power of DARS-RNP and QUASI-RNP for selecting rigid-body docking poses with the potentials previously developed by the Varani and Fernandez groups.
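The general idea behind a quasi-chemical, knowledge-based potential can be sketched as follows. This is a simplified illustration and not the QUASI-RNP or DARS-RNP implementation: it assumes contacts are given as plain (protein residue type, RNA nucleotide type) pairs, derives the reference state from the marginal frequencies of each type, and uses a pseudocount to avoid taking the logarithm of zero. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Contact = Tuple[str, str]  # (protein residue type, RNA nucleotide type)


def quasi_chemical_potential(
    contacts: Iterable[Contact], pseudocount: float = 1.0
) -> Dict[Contact, float]:
    """Derive pair energies as -ln(observed / expected) from native contacts."""
    contacts = list(contacts)
    pair_counts = Counter(contacts)
    protein_counts = Counter(p for p, _ in contacts)
    rna_counts = Counter(r for _, r in contacts)
    total = len(contacts)

    potential = {}
    for p, n_p in protein_counts.items():
        for r, n_r in rna_counts.items():
            # Expected count if the two types paired at random.
            expected = n_p * n_r / total
            observed = pair_counts.get((p, r), 0)
            potential[(p, r)] = -math.log(
                (observed + pseudocount) / (expected + pseudocount)
            )
    return potential


def score_pose(pose_contacts: Iterable[Contact], potential: Dict[Contact, float]) -> float:
    """Sum pair energies over the contacts of one docking pose (lower is better)."""
    return sum(potential.get(contact, 0.0) for contact in pose_contacts)


if __name__ == "__main__":
    training = [("ARG", "G")] * 3 + [("LEU", "A")]
    energies = quasi_chemical_potential(training)
    print(score_pose([("ARG", "G")], energies), score_pose([("ARG", "A")], energies))
```

Pairs seen more often than expected receive negative (favorable) energies. A decoy-based reference state, as in the DARS approach, would replace the random-pairing expectation with contact frequencies counted in docking decoys.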
In both bound and unbound docking tests, DARS-RNP showed the highest ability to identify native-like structures. Python implementations of DARS-RNP and QUASI-RNP are freely available for download at http://iimcb.genesilico.pl/RNP/
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. In the current database version, we included the following new features and data: extended mass spectrometry and liquid chromatography data for modified nucleosides; links between human tRNA sequences and MINTbase, a framework for the interactive exploration of mitochondrial and nuclear tRNA fragments; a new, machine-friendly system of unified abbreviations for modified nucleoside names; sets of modified tRNA sequences for two bacterial species; an updated collection of mammalian tRNA modifications; 19 newly identified modified ribonucleosides; and 66 functionally characterized proteins involved in RNA modification. Data from MODOMICS have been linked to the RNAcentral database of RNA sequences. MODOMICS is available at http://modomics.genesilico.pl.
The spliceosome is a molecular machine that performs the excision of introns from eukaryotic pre-mRNAs. In human cells, this macromolecular complex comprises five RNAs and over one hundred proteins. In recent years, many spliceosomal proteins have been found to exhibit intrinsic disorder, that is, to lack a stable native three-dimensional structure in solution. Building on the previous body of proteomic, structural and functional data, we have carried out a systematic bioinformatics analysis of intrinsic disorder in the proteome of the human spliceosome. We discovered that almost half of the combined sequence of proteins abundant in the spliceosome is predicted to be intrinsically disordered, at least when the individual proteins are considered in isolation. The distribution of intrinsic order and disorder throughout the spliceosome is uneven, and is related to the various functions performed by the intrinsic disorder of the spliceosomal proteins in the complex. In particular, proteins involved in the secondary functions of the spliceosome, such as mRNA recognition, intron/exon definition and spliceosomal assembly and dynamics, are more disordered than proteins directly involved in assisting splicing catalysis. Conserved disordered regions in spliceosomal proteins are evolutionarily younger and less widespread than ordered domains of essential spliceosomal proteins at the core of the spliceosome, suggesting that disordered regions were added to a preexistent ordered functional core. Finally, the spliceosomal proteome contains a much higher amount of intrinsic disorder predicted to lack secondary structure than the proteome of the ribosome, another large RNP machine. This result agrees with the currently recognized different functions of proteins in these two complexes.
SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome, whose outbreak caused the ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally-conserved coronavirus structural RNA elements have been identified to date. Here, we performed RNA structure probing to obtain single-base resolution secondary structure maps of the full SARS-CoV-2 coronavirus genome both in vitro and in living infected cells. Probing data recapitulate the previously described coronavirus RNA elements (5′ UTR and s2m), and reveal new structures. Of these, ∼10.2% show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. Secondary structure-restrained 3D modeling of these segments further allowed for the identification of putative druggable pockets. In addition, we identify a set of single-stranded segments in vivo, showing high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics. Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.
Nucleic acid cleaving DNAzymes are versatile and robust catalysts that outcompete ribozymes and protein enzymes in terms of chemical stability, affordability and ease of synthesis. In spite of their attractiveness, the choice of which DNAzyme should be used to cleave a given substrate is far from obvious, and requires expert knowledge as well as in-depth literature scrutiny. DNAzymeBuilder enables fast and automatic assembly of DNAzymes for the first time, superseding the manual design of DNAzymes. DNAzymeBuilder relies on an internal database with information on RNA and DNA cleaving DNAzymes, including the reaction conditions under which they best operate, their kinetic parameters, the type of cleavage reaction that is catalyzed, the specific sequence that is recognized by the DNAzyme, the cleavage site within this sequence, and special design features that might be necessary for optimal activity of the DNAzyme. Based on this information and the input sequence provided by the user, DNAzymeBuilder provides a list of DNAzymes to carry out the cleavage reaction and detailed information for each of them, including the expected yield, reaction products and optimal reaction conditions. DNAzymeBuilder is a resource to help researchers introduce DNAzymes in their day-to-day research, and is publicly available at https://iimcb.genesilico.pl/DNAzymeBuilder.
Graphical Abstract
DNAzymeBuilder assembles nucleic acid cleaving DNAzymes for site-specific cleavage of RNA, DNA or chimeric substrates and provides detailed information on reaction conditions and products.
Protein-RNA and protein-DNA interactions play fundamental roles in many biological processes. A detailed understanding of these interactions requires knowledge about protein-nucleic acid complex structures. Because the experimental determination of these complexes is time-consuming and perhaps futile in some instances, we have focused on computational docking methods starting from the separate structures. Docking methods are widely employed to study protein-protein interactions; however, only a few methods have been made available to model protein-nucleic acid complexes. Here, we describe NPDock (Nucleic acid-Protein Docking), a novel web server for predicting complexes of protein-nucleic acid structures, which implements a computational workflow that includes docking, scoring of poses, clustering of the best-scored models and refinement of the most promising solutions. The NPDock server provides a user-friendly interface and 3D visualization of the results. The smallest set of input data consists of a protein structure and a DNA or RNA structure in PDB format. Advanced options are available to control specific details of the docking process and obtain intermediate results. The web server is available at http://genesilico.pl/NPDock.
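The step "clustering of the best-scored models" in the workflow above can be sketched with a simple greedy procedure. The abstract does not state which clustering algorithm or distance cutoff NPDock uses, so everything here is an assumption for illustration: the function name `cluster_best_models`, the precomputed pairwise RMSD matrix, the 5.0 Å cutoff, and the convention that lower scores are better.

```python
from typing import List, Sequence, Tuple


def cluster_best_models(
    scores: Sequence[float],
    rmsd: Sequence[Sequence[float]],
    cutoff: float = 5.0,  # assumed RMSD cutoff in angstroms
    top_n: int = 100,     # assumed number of best-scored models to cluster
) -> List[Tuple[int, List[int]]]:
    """Greedily cluster the top-scored poses; returns (center, members) pairs."""
    # Keep only the best-scored poses (lower score assumed to be better).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])[:top_n]
    unassigned = set(ranked)
    clusters = []
    while unassigned:
        best_center, best_members = -1, []
        for i in ranked:
            if i not in unassigned:
                continue
            # Neighbours of pose i that are still free and within the cutoff.
            members = [j for j in ranked if j in unassigned and rmsd[i][j] <= cutoff]
            # Strict '>' keeps the better-scored pose as center when sizes tie.
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        unassigned -= set(best_members)
    return clusters


if __name__ == "__main__":
    toy_scores = [-10.0, -9.0, -8.0, -1.0]
    toy_rmsd = [
        [0.0, 1.0, 20.0, 2.0],
        [1.0, 0.0, 20.0, 2.0],
        [20.0, 20.0, 0.0, 20.0],
        [2.0, 2.0, 20.0, 0.0],
    ]
    print(cluster_best_models(toy_scores, toy_rmsd, cutoff=5.0, top_n=3))
```

The largest cluster is extracted first, on the common assumption that densely populated regions of the docking landscape are the most promising candidates for refinement.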
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, RNA-modifying enzymes and location of modified residues in RNA sequences. In the current database version, accessible at http://modomics.genesilico.pl, we included new features: a census of human and yeast snoRNAs involved in RNA-guided RNA modification, a new section covering the 5'-end capping process, and a catalogue of 'building blocks' for chemical synthesis of a large variety of modified nucleosides. The MODOMICS collections of RNA modifications, RNA-modifying enzymes and modified RNAs have also been updated. A number of newly identified modified ribonucleosides and more than one hundred functionally and structurally characterized proteins from various organisms have been added. In the RNA sequences section, snRNAs and snoRNAs with experimentally mapped modified nucleosides have been added and the current collection of rRNA and tRNA sequences has been substantially enlarged. To facilitate literature searches, each record in MODOMICS has been cross-referenced to other databases and to selected key publications. New options for database searching and querying have been implemented, including a BLAST search of protein sequences and a PARALIGN search of the collected nucleic acid sequences.