Abstract
With the rapid increase of sequenced metazoan mitochondrial genomes, detailed manual annotation is becoming more and more infeasible. While it is easy to identify the approximate location of protein-coding genes within mitogenomes, the peculiar processing of mitochondrial transcripts makes the determination of precise gene boundaries a surprisingly difficult problem. We have analyzed the properties of annotated start and stop codon positions in detail, and use the inferred patterns to devise a new method for predicting gene boundaries in de novo annotations. Our method benefits from empirically observed prevalences of start/stop codons and gene lengths, and considers the dependence of these features on variations of genetic codes. Although not perfect, our new approach yields a drastic improvement in the accuracy of gene boundaries and upgrades the mitochondrial genome annotation server MITOS to an even more sophisticated tool for fully automatic annotation of metazoan mitochondrial genomes.
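The idea of combining start-codon prevalence with expected gene length can be illustrated with a minimal sketch. It is not the MITOS method: the scoring function (log codon frequency plus a Gaussian length penalty), the function names, and all numeric values are assumptions made up for illustration only.

```python
import math

def score_start(seq, pos, stop, codon_freq, mean_len, sd_len):
    """Score a candidate start at `pos` for a gene whose stop codon begins at `stop`.

    Hypothetical scoring: log prevalence of the start codon minus a
    Gaussian penalty for deviating from the expected gene length.
    """
    freq = codon_freq.get(seq[pos:pos + 3], 0.0)
    if freq == 0.0:
        return float("-inf")  # codon never observed as a start
    z = ((stop - pos) - mean_len) / sd_len
    return math.log(freq) - 0.5 * z * z

def best_start(seq, candidates, stop, codon_freq, mean_len, sd_len):
    """Return the best-scoring candidate that is in frame with the stop codon."""
    in_frame = [p for p in candidates if (stop - p) % 3 == 0]
    return max(
        in_frame,
        key=lambda p: score_start(seq, p, stop, codon_freq, mean_len, sd_len),
        default=None,
    )

# Toy sequence: ATG at 0, ATA at 9, stop codon TAA at 21 (values are illustrative).
SEQ = "ATG" + "AAA" * 2 + "ATA" + "CCC" * 3 + "TAA"
FREQ = {"ATG": 0.8, "ATA": 0.2}  # made-up start-codon prevalences
```

With an expected length near 21 nt the upstream ATG wins; with an expected length near 12 nt the length penalty outweighs the rarer ATA codon and the downstream start is chosen.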
Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practice. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases.
The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes.
Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware.
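The reciprocal best alignment heuristic mentioned above can be sketched in a few lines. This is the plain two-species version, not the extended variant that Proteinortho implements; the input layout (a dictionary of alignment scores keyed by query and target) and the function names are assumptions for illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring target.

    `scores` maps (query, target) pairs to an alignment score,
    for example a bit score (assumed input format).
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Toy scores for proteins of two genomes A and B (illustrative values).
A_VS_B = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 80, ("a3", "b1"): 150}
B_VS_A = {("b1", "a1"): 195, ("b1", "a3"): 140, ("b2", "a2"): 75}
```

Here `a3` also hits `b1` best, but `b1` prefers `a1`, so only `(a1, b1)` and `(a2, b2)` are reported as putative orthologs.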
The chromosome 9p21 (Chr9p21) locus of coronary artery disease has been identified in the first surge of genome-wide association and is the strongest genetic factor of atherosclerosis known today. Chr9p21 encodes the long non-coding RNA (ncRNA) antisense non-coding RNA in the INK4 locus (ANRIL). ANRIL expression is associated with the Chr9p21 genotype and correlated with atherosclerosis severity. Here, we report on the molecular mechanisms through which ANRIL regulates target genes in trans, leading to increased cell proliferation, increased cell adhesion and decreased apoptosis, which are all essential mechanisms of atherogenesis. Importantly, trans-regulation was dependent on Alu motifs, which marked the promoters of ANRIL target genes and were mirrored in ANRIL RNA transcripts. ANRIL bound Polycomb group proteins that were highly enriched in the proximity of Alu motifs across the genome and were recruited to promoters of target genes upon ANRIL over-expression. The functional relevance of Alu motifs in ANRIL was confirmed by deletion and mutagenesis, reversing trans-regulation and atherogenic cell functions. ANRIL-regulated networks were confirmed in 2280 individuals with and without coronary artery disease and functionally validated in primary cells from patients carrying the Chr9p21 risk allele. Our study provides a molecular mechanism for pro-atherogenic effects of ANRIL at Chr9p21 and suggests a novel role for Alu elements in epigenetic gene regulation by long ncRNAs.
The detection of differentially methylated regions (DMRs) is a necessary prerequisite for characterizing different epigenetic states. We present a novel program, metilene, to identify DMRs within whole-genome and targeted data with unrivaled specificity and sensitivity. A binary segmentation algorithm combined with a two-dimensional statistical test allows the detection of DMRs in large methylation experiments with multiple groups of samples in minutes rather than days using off-the-shelf hardware. metilene outperforms other state-of-the-art tools for low coverage data and can estimate missing data. Hence, metilene is a versatile tool to study the effect of epigenetic modifications in differentiation/development, tumorigenesis, and systems biology on a global, genome-wide level. Whether in the framework of international consortia with dozens of samples per group, or even without biological replicates, it produces highly significant and reliable results.
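The general shape of a binary segmentation can be sketched as follows. This is not metilene's algorithm: the split criterion used here (largest absolute difference of segment means, accepted above a fixed threshold) is a simple stand-in for the two-dimensional statistical test, and the function name, parameters, and input values are assumptions for illustration.

```python
def binary_segmentation(values, min_len=2, min_gap=0.2):
    """Recursively split `values` into half-open (start, end) segments.

    `values` is assumed to hold per-CpG mean methylation differences
    between two groups. A segment is split at the position that
    maximizes the absolute difference between the means of its two
    halves, as long as that difference reaches `min_gap` and both
    halves keep at least `min_len` positions.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def split(lo, hi):
        best_gap, best_k = 0.0, None
        for k in range(lo + min_len, hi - min_len + 1):
            gap = abs(mean(values[lo:k]) - mean(values[k:hi]))
            if gap > best_gap:
                best_gap, best_k = gap, k
        if best_k is None or best_gap < min_gap:
            return [(lo, hi)]  # no acceptable split: keep the segment whole
        return split(lo, best_k) + split(best_k, hi)

    return split(0, len(values)) if values else []

# Toy signal: a block of elevated differences flanked by background.
DIFFS = [0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
```

On the toy signal the recursion isolates the middle block as its own segment; a real tool would then test each segment for significance rather than rely on a fixed threshold.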
► The organisation of mitochondrial genomes is similar but not identical among animals. ► Several different rearrangement operations shape animal mitogenomes. ► Differences in genome structure are reflected in the mito-transcriptome.
Many years of extensive studies of metazoan mitochondrial genomes have established differences in gene arrangements and genetic codes as valuable phylogenetic markers. Understanding the underlying mechanisms of replication and transcription, and the role of the control regions, which cause, for example, different gene orders, is important to assess the phylogenetic signal of such events. This review summarises and discusses, for the Metazoa, the general aspects of mitochondrial transcription and replication with respect to control regions as well as several proposed models of gene rearrangements. As whole genome sequencing projects accumulate, more and more observations about mitochondrial gene transfer to the nucleus are reported. Thus, the occurrence and phylogenetic aspects of nuclear mitochondrial-like sequences (NUMTs) are a further topic of this review.
Genome sequencing of Helicobacter pylori has revealed the potential proteins and genetic diversity of this prevalent human pathogen, yet little is known about its transcriptional organization and noncoding RNA output. Massively parallel cDNA sequencing (RNA-seq) has been revolutionizing global transcriptomic analysis. Here, using a novel differential approach (dRNA-seq) selective for the 5' end of primary transcripts, we present a genome-wide map of H. pylori transcriptional start sites and operons. We discovered hundreds of transcriptional start sites within operons, and opposite to annotated genes, indicating that complexity of gene expression from the small H. pylori genome is increased by uncoupling of polycistrons and by genome-wide antisense transcription. We also discovered an unexpected number of approximately 60 small RNAs including the epsilon-subdivision counterpart of the regulatory 6S RNA and associated RNA products, and potential regulators of cis- and trans-encoded target messenger RNAs. Our approach establishes a paradigm for mapping and annotating the primary transcriptomes of many living species.
It is widely assumed that one of the fundamental properties of spoken language is the arbitrary relation between sound and meaning. Some exceptions in the form of nonarbitrary associations have been documented in linguistics, cognitive science, and anthropology, but these studies only involved small subsets of the 6,000+ languages spoken in the world today. By analyzing word lists covering nearly two-thirds of the world’s languages, we demonstrate that a considerable proportion of 100 basic vocabulary items carry strong associations with specific kinds of human speech sounds, occurring persistently across continents and linguistic lineages (linguistic families or isolates). Prominently among these relations, we find property words (“small” and i, “full” and p or b) and body part terms (“tongue” and l, “nose” and n). The areal and historical distribution of these associations suggests that they often emerge independently rather than being inherited or borrowed. Our results therefore have important implications for the language sciences, given that nonarbitrary associations have been proposed to play a critical role in the emergence of cross-modal mappings, the acquisition of language, and the evolution of our species’ unique communication system.
► High-quality de novo annotation of metazoan mitochondrial genomes. ► MITOS is available as a fully automatic web server. ► Consistent reannotation of available mitogenomes.
About 2000 completely sequenced mitochondrial genomes are available from the NCBI RefSeq database together with manually curated annotations of their protein-coding genes, rRNAs, and tRNAs. This annotation information, which has accumulated over two decades, has been obtained with a diverse set of computational tools and annotation strategies. Despite all efforts of manual curation, it is still plagued by misassignments of reading directions, erroneous gene names, and missing as well as false positive annotations, in particular for the RNA genes. Taken together, this causes substantial problems for fully automatic pipelines that aim to use these data comprehensively for studies of animal phylogenetics and the molecular evolution of mitogenomes. The MITOS pipeline is designed to compute a consistent de novo annotation of the mitogenomic sequences. We show that the results of MITOS match RefSeq and MitoZoa in terms of annotation coverage and quality. At the same time we avoid biases, inconsistencies of nomenclature, and typos originating from manual curation strategies. The MITOS pipeline is accessible online at http://mitos.bioinf.uni-leipzig.de.
ViennaRNA Package 2.0
Lorenz, Ronny; Bernhart, Stephan H; Höner Zu Siederdissen, Christian ...
Algorithms for Molecular Biology, 11/2011, Volume 6, Issue 1
Journal article, peer reviewed, open access
Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties.
The ViennaRNA Package has been a widely used compilation of RNA secondary structure related computer programs for nearly two decades. Major changes in the structure of the standard energy model, the Turner 2004 parameters, the pervasive use of multi-core CPUs, and an increasing number of algorithmic variants prompted a major technical overhaul of both the underlying RNAlib and the interactive user programs. New features include an expanded repertoire of tools to assess RNA-RNA interactions and restricted ensembles of structures, additional output information such as centroid structures and maximum expected accuracy structures derived from base pairing probabilities, or z-scores for locally stable secondary structures, and support for input in fasta format. Updates were implemented without compromising the computational efficiency of the core algorithms and ensuring compatibility with earlier versions.
The ViennaRNA Package 2.0, supporting concurrent computations via OpenMP, can be downloaded from http://www.tbi.univie.ac.at/RNA.
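The kind of dynamic programming referred to above can be illustrated with the classic base-pair maximization recursion (Nussinov-style). This is a deliberately simplified stand-in: it counts base pairs instead of evaluating the Turner energy model, and it does not use or reproduce any ViennaRNA code. The function name and the minimum loop length of 3 are assumptions for this sketch.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in `seq`.

    Allowed pairs are Watson-Crick plus G-U wobble; at least
    `min_loop` unpaired bases must separate paired positions.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

For the hairpin-like sequence `GGGAAAUCC` the recursion finds three stacked pairs closing a loop of three adenines. Energy-based folding follows the same decomposition principle but distinguishes loop types and scores them with measured thermodynamic parameters.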
RNA aptamers readily recognize small organic molecules, polypeptides, as well as other nucleic acids in a highly specific manner. Many such aptamers have evolved as parts of regulatory systems in nature. Experimental selection techniques such as SELEX have been very successful in finding artificial aptamers for a wide variety of natural and synthetic ligands. Changes in structure and/or stability of aptamers upon ligand binding can propagate through larger RNA constructs and cause specific structural changes at distal positions. In turn, these may affect transcription, translation, splicing, or binding events. The RNA secondary structure model realistically describes both thermodynamic and kinetic aspects of RNA structure formation and refolding at a single, consistent level of modelling. Thus, this framework allows studying the function of natural riboswitches in silico. Moreover, it enables rationally designing artificial switches, combining essentially arbitrary sensors with a broad choice of read-out systems. Eventually, this approach sets the stage for constructing versatile biosensors.