People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.1 of SBOL Visual, which builds on the prior SBOL Visual 2.0 standard by expanding diagram syntax to include methods for showing modular structure and mappings between elements of a system and interaction arrows that can split or join (with the glyph at the split or join indicating either superposition or a chemical process), and by adding new glyphs for indicating genomic context (e.g., integration into a plasmid or genome) and for stop codons.
People who are engineering biological organisms often find it useful to communicate in diagrams, both about the structure of the nucleic acid sequences that they are engineering and about the functional relationships between sequence features and other molecular species. Some typical practices and conventions have begun to emerge for such diagrams. The Synthetic Biology Open Language Visual (SBOL Visual) has been developed as a standard for organizing and systematizing such conventions in order to produce a coherent language for expressing the structure and function of genetic designs. This document details version 2.0 of SBOL Visual, which builds on the prior SBOL Visual 1.0 standard by expanding diagram syntax to include functional interactions and molecular species, making the relationship between diagrams and the SBOL data model explicit, supporting families of symbol variants, clarifying a number of requirements and best practices, and significantly expanding the collection of diagram glyphs.
Drug development is expensive and prone to failure. Reusing a drug developed for one condition to treat a second disease is potentially much less risky and expensive than developing an entirely new compound. Systematic approaches to drug repositioning are needed to increase throughput and find candidates more reliably. Here we address this need with an integrated systems biology dataset, developed using the Ondex data integration platform, for the in silico discovery of new drug repositioning candidates. We demonstrate that the information in this dataset allows known repositioning examples to be discovered. We also propose a means of automating the search for new treatment indications of existing compounds.
The use of microorganisms for the production of industrially important compounds and enzymes is becoming increasingly important. Eukaryotes have been less widely used than prokaryotes in biotechnology, because of the complexity of their genomic structure and biology. The Yeast2.0 project is an international effort to engineer the yeast Saccharomyces cerevisiae to make it easy to manipulate, and to generate random variants using a system called SCRaMbLE. SCRaMbLE relies on artificial evolution in vitro to identify useful variants, an approach which is time consuming and expensive. We developed an in silico simulator for the SCRaMbLE system, using an evolutionary computing approach, which can be used to investigate and optimize the fitness landscape of the system. We applied the system to the investigation of the fitness landscape of one of the S. cerevisiae chromosomes, and found that our results fitted well with those previously published. We then simulated directed evolution with or without manipulation of SCRaMbLE, and revealed that controlling the SCRaMbLE process could effectively impact directed evolution. Our simulator can be applied to the analysis of the fitness landscapes of any organism for which SCRaMbLE has been implemented.
Interactome analyses have traditionally been applied to yeast, human and other model organisms due to the availability of protein-protein interaction data for these species. Recently, these techniques have been applied to more diverse species using computational interaction prediction from genome sequence and other data types. This review describes the various types of computational interactome networks that can be created and how they have been used in diverse eukaryotic species, highlighting some of the key interactome studies in non-model organisms.
As high-throughput technologies become cheaper and easier to use, raw sequence data and corresponding annotations for many organisms are becoming available. However, sequence data alone is not sufficient to explain the biological behaviour of organisms, which arises largely from complex molecular interactions. There is a need to develop new platform technologies that can be applied to the investigation of whole-genome datasets in an efficient and cost-effective manner. One such approach is the transfer of existing knowledge from well-studied organisms to closely related organisms. In this paper, we describe a system, BacillusRegNet, for the use of a model organism, Bacillus subtilis, to infer genome-wide regulatory networks in less well-studied close relatives. The putative transcription factors, their binding sequences and predicted promoter sequences along with annotations are available from the associated BacillusRegNet website (http://bacillus.ncl.ac.uk).
The Human Proteome Organisation's Proteomics Standards Initiative has developed the GelML (gel electrophoresis markup language) data exchange format for representing gel electrophoresis experiments performed in proteomics investigations. The format closely follows the reporting guidelines for gel electrophoresis, which are part of the Minimum Information About a Proteomics Experiment (MIAPE) set of modules. GelML supports the capture of metadata (such as experimental protocols) and data (such as gel images) resulting from gel electrophoresis so that laboratories can be compliant with the MIAPE Gel Electrophoresis guidelines, while allowing such data sets to be exchanged or downloaded from public repositories. The format is sufficiently flexible to capture data from a broad range of experimental processes, and complements other PSI formats for MS data and the results of protein and peptide identifications to capture entire gel-based proteome workflows. GelML has resulted from the open standardisation process of PSI consisting of both public consultation and anonymous review of the specifications.
The creation of accurate quantitative Systems Biology Markup Language (SBML) models is a time-intensive, manual process often complicated by the many data sources and formats required to annotate even a small and well-scoped model. Ideally, the retrieval and integration of biological knowledge for model annotation should be performed quickly, precisely, and with a minimum of manual effort.
Here we present rule-based mediation, a method of semantic data integration applied to systems biology model annotation. The heterogeneous data sources are first syntactically converted into ontologies, which are then aligned to a small domain ontology by applying a rule base. We demonstrate proof-of-principle of this application of rule-based mediation using off-the-shelf semantic web technology through two use cases for SBML model annotation. Existing tools and technology provide a framework around which the system is built, reducing development time and increasing usability.
Integrating resources in this way accommodates multiple formats with different semantics, and provides richly-modelled biological knowledge suitable for annotation of SBML models. This initial work establishes the feasibility of rule-based mediation as part of an automated SBML model annotation system.
Detailed information on the project files as well as further information on and comparisons with similar projects is available from the project page at http://cisban-silico.cs.ncl.ac.uk/RBM/.
The rapid and cost-effective identification of bacterial species is crucial, especially for clinical diagnosis and treatment. Peptide aptamers have been shown to be valuable for use as a component of novel, direct detection methods. These small peptides have a number of advantages over antibodies, including greater specificity and longer shelf life. These properties facilitate their use as the detector components of biosensor devices. However, the identification of suitable aptamer targets for particular groups of organisms is challenging. We present a semi-automated processing pipeline for the identification of candidate aptamer targets from whole bacterial genome sequences. The pipeline can be configured to search for protein sequence fragments that uniquely identify a set of strains of interest. The system is also capable of identifying additional organisms that may be of interest due to their possession of protein fragments in common with the initial set. Through the use of Cloud computing technology and distributed databases, our system is capable of scaling with the rapidly growing genome repositories, and consequently of keeping the resulting data sets up-to-date. The system described is also more generically applicable to the discovery of specific targets for other diagnostic approaches such as DNA probes, PCR primers and antibodies.
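The core search described in the aptamer-target abstract above, finding protein sequence fragments that distinguish a set of strains, can be illustrated with a toy sketch. This is not the published pipeline: the function names, the fixed fragment length `k`, the one-sequence-per-strain simplification, and the reading of "uniquely identify" as "present in every strain of interest and absent from every other strain" are all assumptions made for illustration.

```python
from typing import Dict, Set


def fragments(sequence: str, k: int) -> Set[str]:
    """Return every length-k fragment (k-mer) of a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def unique_fragments(targets: Dict[str, str],
                     background: Dict[str, str],
                     k: int) -> Set[str]:
    """Fragments shared by all target strains and absent from all background strains.

    Hypothetical helper: each strain is reduced to a single protein
    sequence, whereas a real pipeline would scan whole proteomes.
    """
    if not targets:
        return set()
    # Keep only fragments that occur in every strain of interest.
    shared = set.intersection(*(fragments(seq, k) for seq in targets.values()))
    # Discard any fragment that also occurs in a strain outside the set.
    for seq in background.values():
        shared -= fragments(seq, k)
    return shared


# Toy sequences, invented for this example.
targets = {"strain_a": "MKTAYIAKQR", "strain_b": "GGMKTAYIAK"}
background = {"strain_c": "MKTAYLLKQR"}
candidates = unique_fragments(targets, background, k=5)
print(sorted(candidates))
```

Set intersection and difference keep the sketch short. At the scale of growing genome repositories, the same comparison would presumably need the distributed storage the abstract mentions rather than in-memory sets.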